How to Calculate Rolling Correlation in Python

Correlations are very helpful for prediction and forecasting models when a relationship exists between the variables.

A rolling correlation is the correlation between two time series calculated over a rolling (moving) window of observations.
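As a rough sketch of what this means, assuming the Pearson correlation coefficient and a window of w observations ending at time t (the symbols w, t, x_i and y_i are notation introduced here for illustration), the rolling correlation can be written as:

```latex
r_t = \frac{\sum_{i=t-w+1}^{t} (x_i - \bar{x}_t)(y_i - \bar{y}_t)}
           {\sqrt{\sum_{i=t-w+1}^{t} (x_i - \bar{x}_t)^2 \; \sum_{i=t-w+1}^{t} (y_i - \bar{y}_t)^2}},
\qquad
\bar{x}_t = \frac{1}{w} \sum_{i=t-w+1}^{t} x_i,
\quad
\bar{y}_t = \frac{1}{w} \sum_{i=t-w+1}^{t} y_i
```

Each value r_t uses only the w most recent observations, so the window slides forward by one step for every new value.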

With rolling correlations, you can determine whether two correlated time series deviate from each other over time. They also help you identify shifts in trend as the series evolve.

In this tutorial, we will discuss how to calculate rolling correlation in Python using the pandas and numpy packages, with step-by-step examples.

If you don't have the pandas package installed on your system, run the command below in the Windows command prompt

pip3 install pandas

How to calculate Rolling Correlations in Python

We will use the pandas package to calculate rolling correlations. It provides the rolling().corr() method for this calculation.

The pandas package in Python is written for data manipulation and data analysis. It provides many functions for working with numerical data and time series.
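Before moving to the sales example, here is a minimal sketch of the general call pattern. The two toy series and the names a, b and window are made up for illustration. Because b is an exact linear function of a, every complete window should have a correlation of approximately 1.

```python
import pandas as pd

# Two toy series (illustrative values only); b is exactly 2 * a
a = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])
b = pd.Series([2.0, 4.0, 6.0, 8.0, 10.0])

# Number of observations in each rolling window
window = 3

# Correlation of a with b over each window of 3 consecutive rows
rolling_corr = a.rolling(window).corr(b)

print(rolling_corr)
```

The first window - 1 results are NaN because those rows do not yet have a complete window behind them.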

Let's understand rolling correlation with the help of the example below.

Consider a company that has two products, X and Y. The sales of Product X and Product Y over 12 months are given below.

Month  Product X  Product Y
1      200        350
2      225        390
3      240        400
4      260        390
5      220        400
6      270        430
7      300        500
8      280        510
9      310        525
10     330        550
11     310        500
12     250        390
Product Sales Matrix

Using the data above, calculate

a. the 3-month rolling correlation between Product X and Product Y

b. the 6-month rolling correlation between Product X and Product Y

Calculate Rolling Correlation for 3 months

Use the Python code below to create a DataFrame for Product X and Product Y and calculate the rolling correlation over a 3-month window

import pandas as pd
import numpy as np

#Create DataFrame

salexmatrix = pd.DataFrame({'month':np.arange(1,13),
                   'x':[200,225,240,260,220,270,300,280,310,330,310,250],
                   'y':[350,390,400,390,400,430,500,510,525,550,500,390]})

#View first 5 rows of DataFrame

print(salexmatrix.head())

#Calculate 3-months rolling correlation between sales for product x and product y

print(salexmatrix['x'].rolling(3).corr(salexmatrix['y']))

The above Python code uses the pandas rolling().corr() method for the calculation. It returns the output below

   month    x    y
0      1  200  350
1      2  225  390
2      3  240  400
3      4  260  390
4      5  220  400
0          NaN
1          NaN
2     0.981981
3    -0.082199
4    -0.866025
5     0.453921
6     0.932216
7     0.675845
8     0.433555
9     0.966823
10    0.866025
11    0.997701

The output first prints the top 5 rows of the DataFrame, followed by the 3-month rolling correlation between Product X and Product Y. The first two values are NaN because a full 3-month window is not yet available for months 1 and 2. Based on the above output, we can conclude that

The correlation between X and Y sales for months 1 to 3 is 0.981981

The correlation between X and Y sales for months 2 to 4 is -0.082199

and so on for every other 3-month rolling window.
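The introduction mentions numpy, so here is a hedged sketch of how the same 3-month windows could be checked by hand. It slices each window out of the sales figures and passes it to np.corrcoef. The names x, y, window and manual are illustrative and not part of the original code.

```python
import numpy as np

# Sales figures from the Product Sales Matrix
x = np.array([200, 225, 240, 260, 220, 270, 300, 280, 310, 330, 310, 250])
y = np.array([350, 390, 400, 390, 400, 430, 500, 510, 525, 550, 500, 390])

window = 3
manual = []

# Slide a window of 3 months across the data, one month at a time
for end in range(window, len(x) + 1):
    xs = x[end - window:end]
    ys = y[end - window:end]
    # np.corrcoef returns a 2x2 matrix; the off-diagonal entry is corr(xs, ys)
    manual.append(np.corrcoef(xs, ys)[0, 1])

for start, r in enumerate(manual, start=1):
    print(f"months {start}-{start + window - 1}: {r:.6f}")
```

The printed values should agree with the non-NaN entries of the pandas output above, which is a quick way to confirm what each row of the rolling result represents.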

Calculate Rolling Correlation for 6 months

Now, to calculate the rolling correlation over a 6-month rolling window using the same product sales matrix, use the code below

import pandas as pd
import numpy as np

#Create DataFrame

salexmatrix = pd.DataFrame({'month':np.arange(1,13),
                   'x':[200,225,240,260,220,270,300,280,310,330,310,250],
                   'y':[350,390,400,390,400,430,500,510,525,550,500,390]})

#View first 5 rows of DataFrame

print(salexmatrix.head())

#Calculate 6-months rolling correlation between sales for product x and product y

print(salexmatrix['x'].rolling(6).corr(salexmatrix['y']))

The above Python code returns the rolling correlations between sales of Product X and Product Y over a 6-month rolling window, as below

   month    x    y
0      1  200  350
1      2  225  390
2      3  240  400
3      4  260  390
4      5  220  400
0          NaN
1          NaN
2          NaN
3          NaN
4          NaN
5     0.794638
6     0.848152
7     0.800926
8     0.840528
9     0.930323
10    0.828476
11    0.900379

Based on the above output, we can conclude that

The correlation between Product X and Y for months 1 to 6 is 0.794638

The correlation between Product X and Y for months 2 to 7 is 0.848152

and so on for every other 6-month rolling window.
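To compare the two window lengths side by side, one option is to store both results as columns of the same DataFrame. The sketch below does that and also flags the 3-month windows whose correlation falls below a cutoff. The column names corr_3m and corr_6m, the name sales, and the threshold of 0 are assumptions chosen for illustration, not part of the original example.

```python
import numpy as np
import pandas as pd

# Same sales data as in the examples above
sales = pd.DataFrame({'month': np.arange(1, 13),
                      'x': [200, 225, 240, 260, 220, 270, 300, 280, 310, 330, 310, 250],
                      'y': [350, 390, 400, 390, 400, 430, 500, 510, 525, 550, 500, 390]})

# Store both rolling correlations as new columns (hypothetical column names)
sales['corr_3m'] = sales['x'].rolling(3).corr(sales['y'])
sales['corr_6m'] = sales['x'].rolling(6).corr(sales['y'])

# Keep only the rows where both windows are complete
print(sales.dropna())

# Illustrative cutoff: flag 3-month windows with a negative correlation
threshold = 0.0
flagged = sales.loc[sales['corr_3m'] < threshold, 'month'].tolist()
print(flagged)
```

Each flagged month is the last month of its window. With the sales figures above, this flags the windows ending in months 4 and 5, which matches the negative values in the 3-month output. The 6-month correlations stay positive throughout, which shows how a longer window smooths out short-lived deviations.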

Conclusion

I hope you found this tutorial on how to calculate rolling correlation in Python educational and helpful. We used the pandas rolling().corr() method for the calculation.