import pandas as pd
Arif Qodari
January 30, 2022
Given a dataset with a column named n that contains a list of integers, calculate a moving average with a window size of 3.
pandas provides the method pandas.DataFrame.rolling for moving-window calculations. In this particular case, we will calculate the mean of each window.
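As a minimal sketch of this step, the snippet below builds the data frame and adds the backward-looking moving average. The values of n are copied from the output table in this post; the variable name df is an assumption made for illustration.

```python
import pandas as pd

# Sample values copied from the n column of the table in this post.
# The name df is an assumption for this sketch.
df = pd.DataFrame({"n": [20, 44, 46, 23, 10, 80, 52, 64, 87, 77]})

# Backward-looking window of size 3: each mean covers the current row
# and the 2 rows before it.
df["moving_average"] = df["n"].rolling(window=3).mean()

print(df)
```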
| | n | moving_average |
|---|---|---|
| 0 | 20 | NaN |
| 1 | 44 | NaN |
| 2 | 46 | 36.666667 |
| 3 | 23 | 37.666667 |
| 4 | 10 | 26.333333 |
| 5 | 80 | 37.666667 |
| 6 | 52 | 47.333333 |
| 7 | 64 | 65.333333 |
| 8 | 87 | 67.666667 |
| 9 | 77 | 76.000000 |
OK. So far so good. We calculate a moving average using a window size of 3 and then assign the result to a new column named moving_average.
The first two rows are NaN because the rolling function looks backward. For example, with a window size of 3, each window covers the values from position current row - 2 through the current row. The first two windows contain fewer than 3 values (the window size), so their mean can’t be calculated and the result is NaN.
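To make the backward-looking window concrete, here is a rough sketch that rebuilds the same result with an explicit loop. It is only an illustration of which rows each window covers, not how pandas computes it internally; the names n, window_size, and manual are assumptions for this sketch.

```python
import pandas as pd

# Same values as the n column in this post.
n = pd.Series([20, 44, 46, 23, 10, 80, 52, 64, 87, 77])
window_size = 3

manual = []
for i in range(len(n)):
    start = i - window_size + 1  # position of current row - 2
    if start < 0:
        # Fewer than 3 values are available, so the result is NaN.
        manual.append(float("nan"))
    else:
        # Mean of the rows from current row - 2 through the current row.
        manual.append(n.iloc[start : i + 1].mean())

manual = pd.Series(manual)
print(manual)
```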
What if we want to calculate a forward-looking moving average instead? That means a window must contain the values from the current row through the next 2 rows before calculating the mean. To do that we need a special window indexer called pandas.api.indexers.FixedForwardWindowIndexer.
Here is what we can do.
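A minimal sketch of the forward-looking version follows. The values of n are again copied from the table in this post, and the names df and indexer are assumptions for illustration. Passing min_periods=3 explicitly is also an assumption of this sketch: it makes sure the incomplete windows at the end give NaN, as in the table.

```python
import pandas as pd
from pandas.api.indexers import FixedForwardWindowIndexer

# Sample values copied from the n column of the table in this post.
df = pd.DataFrame({"n": [20, 44, 46, 23, 10, 80, 52, 64, 87, 77]})

# Forward-looking window: the current row plus the next 2 rows.
indexer = FixedForwardWindowIndexer(window_size=3)

# min_periods=3 requires a full window, so the last two rows are NaN.
df["moving_average"] = df["n"].rolling(window=indexer, min_periods=3).mean()

print(df)
```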
| | n | moving_average |
|---|---|---|
| 0 | 20 | 36.666667 |
| 1 | 44 | 37.666667 |
| 2 | 46 | 26.333333 |
| 3 | 23 | 37.666667 |
| 4 | 10 | 47.333333 |
| 5 | 80 | 65.333333 |
| 6 | 52 | 67.666667 |
| 7 | 64 | 76.000000 |
| 8 | 87 | NaN |
| 9 | 77 | NaN |
The code is similar to the previous one, but now we pass a special window indexer as the window parameter of the rolling function.
Then, Voila! The result is as expected.