First of all apologies for the long post. Hope someone can offer some advice.
I have about 200 columns of time series data that I need to perform a correlation analysis on in terms calculating lagged cross correlations between all the variables.
Currently the data is stored in Excel. Variable 1 is in Column A, variable 2 in Column B etc, the data ends in column GR which contains the last variable. I need to calculate the cross correlations between variable 1 in Column A and all other variables in the other remaining columns. The key thing here is that I need to calculate lagged as well as contemperaous correlations between each of the variables.
Then I want to do the same for the next variable in Column B, i.e. calculate contemperaous and lagged correlations between variable 2 and all other variables.
Is there anyway to automate this so that it work out the correlations up to say 10 lags of for each variable against all other variables and paste the results in some kind of readable grid in an excel sheet for each variable:
For example, for Column A - Variable 1 the results could be shown like this (for the first five variables):
B C D E F
Lag0 X X X X X
Lag1 X X X X X
Lag2 X X X X X
Lag3 X X X X X
Lag4 X X X X X
Lag5 X X X X X
Lag6 X X X X X
Lag7 X X X X X
Lag8 X X X X X
Lag9 X X X X X
Lag10 X X X X X
Where X represents the correlation coefficient between variable 1 and all the other variables (col B, C, D, E, F to Col GR) at different lags. The output doesn't have to be exactly like this though.
I can do this manually in excel using the Correl function, adjusting the series range in the formula every time I want to work out the lagged correlation.
Background: The correlation the function takes two arguments:
So to calculate correlations for Column A against all other columns I would input:
correl (A2:A100, B2:B100) to give the contemperous correlation between Col A and Col B, correl (A2:A100, c2:c100) to give correlation for col C vs. Col A etc.
Then to work out the correlation at 1 Lag I change the formula to (A3:A100, B2:B99), at lag 2 the formula becomes (A4:A100, B2:B98) etc. Then I do this for all the other columns/Lags but to do this for 200 variables, would take a huge amount of time (probably several days)! Is there a way of doing it using VBA perhaps? Even if it takes a few hours that's OK, as I can leave it to run in the background.
The way I'm thinking of doing this is setting up a workbook with the following sheets:
1) Source Sheet - this would contain all the raw data (Columns 1-200)as described in the top of the post.
2) Calculation sheet - Here I would have a duplicate of the above sheet but with one extra column (say column A) and so the dataset would end in column 201). Beneath this matrix I would set up formulas using the Excel