Arif Qodari
January 30, 2022
Given a dataset with a column named n that contains a list of integers, calculate a moving average with a window size of 3.
pandas provides the method pandas.DataFrame.rolling for moving-window calculations. In this case, we calculate the mean of each window.
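The calculation described above can be sketched as follows. This is a minimal sketch: the input values are taken from the n column of the output table, and the rolling call is applied to the single column df["n"] (Series.rolling, which behaves like DataFrame.rolling for one column).

```python
import pandas as pd

# Sample data: the ten integers shown in column n of the output table.
df = pd.DataFrame({"n": [20, 44, 46, 23, 10, 80, 52, 64, 87, 77]})

# Default (backward-looking) rolling window over 3 rows, then the mean
# of each window, stored in a new column.
df["moving_average"] = df["n"].rolling(window=3).mean()

print(df)
```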
| | n | moving_average |
|---|---|---|
| 0 | 20 | NaN |
| 1 | 44 | NaN |
| 2 | 46 | 36.666667 |
| 3 | 23 | 37.666667 |
| 4 | 10 | 26.333333 |
| 5 | 80 | 37.666667 |
| 6 | 52 | 47.333333 |
| 7 | 64 | 65.333333 |
| 8 | 87 | 67.666667 |
| 9 | 77 | 76.000000 |
So far so good. Here, we calculate a moving average with a window size of 3 and assign the result to a new column named moving_average.
The first two rows are NaN because the rolling function looks backward. For example, with a window size of 3, each window covers the values from position current row - 2 through the current row. The first two rows have fewer than 3 values available (the window size), so the mean cannot be computed and the result is NaN.
What if we want a forward-looking moving average instead? In that case, each window must contain the values from the current row through the next 2 rows. To do that, we need a special window indexer called pandas.api.indexers.FixedForwardWindowIndexer.
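To make the backward-looking window concrete, the sketch below rebuilds the same result by hand: for each row it slices rows current row - 2 through current row and averages them, emitting NaN when fewer than 3 rows are available. The names manual and rolled are only illustrative, and the comparison uses a tolerance because the two approaches may differ in the last floating-point digits.

```python
import numpy as np
import pandas as pd

n = pd.Series([20, 44, 46, 23, 10, 80, 52, 64, 87, 77])
window = 3

manual = []
for i in range(len(n)):
    start = i - window + 1  # first row of the backward window
    if start < 0:
        # Fewer than `window` rows exist up to row i, so no mean is computed.
        manual.append(float("nan"))
    else:
        manual.append(n.iloc[start : i + 1].mean())

manual = pd.Series(manual)
rolled = n.rolling(window=window).mean()

# Compare the hand-built result with pandas, treating NaN == NaN as equal.
print(np.allclose(manual, rolled, equal_nan=True))
```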
Here is what we can do.
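A minimal sketch of the forward-looking version, assuming the same ten values as before. FixedForwardWindowIndexer is created with window_size=3 and passed as the window argument of rolling; min_periods is left at its default, so windows with fewer than 3 values yield NaN.

```python
import pandas as pd
from pandas.api.indexers import FixedForwardWindowIndexer

# Same sample data as in the backward-looking example.
df = pd.DataFrame({"n": [20, 44, 46, 23, 10, 80, 52, 64, 87, 77]})

# Each window starts at the current row and extends 2 rows ahead.
indexer = FixedForwardWindowIndexer(window_size=3)
df["moving_average"] = df["n"].rolling(window=indexer).mean()

print(df)
```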
| | n | moving_average |
|---|---|---|
| 0 | 20 | 36.666667 |
| 1 | 44 | 37.666667 |
| 2 | 46 | 26.333333 |
| 3 | 23 | 37.666667 |
| 4 | 10 | 47.333333 |
| 5 | 80 | 65.333333 |
| 6 | 52 | 67.666667 |
| 7 | 64 | 76.000000 |
| 8 | 87 | NaN |
| 9 | 77 | NaN |
The code is similar to the previous one, but now we pass a special window indexer as the window parameter of the rolling function.
Then, voilà! The result is as expected. The NaN values have moved to the last two rows, because fewer than 3 values remain from those rows onward.
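As a quick sanity check, the forward-looking result should match the backward-looking one shifted up by 2 rows. The sketch below compares the two; the variable names forward and shifted are only illustrative, and the shift-based version is a cross-check rather than the approach used in this post.

```python
import numpy as np
import pandas as pd
from pandas.api.indexers import FixedForwardWindowIndexer

n = pd.Series([20, 44, 46, 23, 10, 80, 52, 64, 87, 77])

# Forward-looking mean via the window indexer.
forward = n.rolling(window=FixedForwardWindowIndexer(window_size=3)).mean()

# Backward-looking mean, moved up by window size - 1 = 2 rows.
shifted = n.rolling(window=3).mean().shift(-2)

# Compare with a tolerance, treating NaN == NaN as equal.
print(np.allclose(forward, shifted, equal_nan=True))
```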