I am fairly new to multivariate time series. I am trying to build a VAR model with 108 predictors and 1 target variable. While performing the Johansen cointegration test, I get the following error:
LinAlgError: Matrix is not positive definite
My code is:
```python
from statsmodels.tsa.vector_ar.vecm import coint_johansen


def cointegration_test(df, alpha=0.05):
    """Perform Johansen's cointegration test and report a summary."""
    # det_order=-1: no deterministic terms; k_ar_diff=5: five lagged differences
    out = coint_johansen(df, -1, 5)
    d = {'0.90': 0, '0.95': 1, '0.99': 2}
    traces = out.lr1  # trace statistics
    # Format the level with two decimals so that alpha=0.1 maps to '0.90'
    cvts = out.cvt[:, d[f'{1 - alpha:.2f}']]

    def adjust(val, length=6):
        return str(val).ljust(length)

    # Summary
    print('Name :: Test Stat > C(95%) => Signif \n', '--' * 20)
    for col, trace, cvt in zip(df.columns, traces, cvts):
        print(adjust(col), ':: ', adjust(round(trace, 2), 9), ">", adjust(cvt, 8), ' => ', trace > cvt)


cointegration_test(g)
```
Here g is my time series DataFrame of shape (48 rows × 109 columns). The rows are indexed by date-time and the columns are the predictors/variables.
Some columns range from 0 to 1 (e.g. Consumer Price Index), while others are in the millions (e.g. Population, GDP).
Some columns contain negative values as well (e.g. Change in Employment).
A few columns also contain zeros.
Even when I first make all the columns stationary using
g = g.diff().dropna().diff().dropna()
and then pass the differenced DataFrame to cointegration_test, I still get the error:
LinAlgError: Matrix is not positive definite
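To make the dimensions concrete, here is a minimal sketch of what the double differencing does to the number of rows. It uses synthetic random data, not my real data; the name `demo`, the monthly frequency, and the start date are assumptions made up for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for g: 48 dates x 109 columns of random numbers.
rng = np.random.default_rng(0)
idx = pd.date_range("2000-01-01", periods=48, freq="MS")  # assumed frequency
demo = pd.DataFrame(rng.standard_normal((48, 109)), index=idx)

# Same transformation as above: each diff().dropna() removes one leading row.
demo_diff = demo.diff().dropna().diff().dropna()

n_obs, n_vars = demo_diff.shape
print(f"observations: {n_obs}, variables: {n_vars}")
print("fewer observations than variables:", n_obs < n_vars)
```

So the differenced frame has even fewer rows than the original, and the 5 lagged differences requested in coint_johansen use up further observations, while the number of variables stays at 109.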
As far as I understand, "Matrix is not positive definite" means that not all of the eigenvalues of the matrix are positive. Eigenvalues are only defined for square matrices, but the data I am feeding in is non-square...
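A small experiment on the square-versus-non-square point, as a sketch under assumptions: I assume the test internally works with a square product-moment matrix of the form X'X built from the data, and that the error comes from a Cholesky factorization of such a matrix. The data below is random and the sizes (46 × 109) only mimic my differenced frame.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs, n_vars = 46, 109  # assumed sizes, mimicking the differenced data

# Non-square data matrix: rows are observations, columns are variables.
X = rng.standard_normal((n_obs, n_vars))
X = X - X.mean(axis=0)  # demean each column

# The moment matrix is square (109 x 109) even though X is not.
S = X.T @ X / n_obs

# Its rank cannot exceed the number of observations, so it is singular.
rank = np.linalg.matrix_rank(S)
min_eig = np.linalg.eigvalsh(S).min()
print("shape of S:", S.shape)
print("rank of S:", rank, "out of", n_vars)
print("smallest eigenvalue:", min_eig)

# Cholesky needs strictly positive eigenvalues; report whether it succeeds.
try:
    np.linalg.cholesky(S)
    print("Cholesky succeeded")
except np.linalg.LinAlgError as err:
    print("Cholesky failed:", err)
```

If this is indeed what happens inside the test, the matrix being checked is square, and with fewer observations than variables it cannot have all eigenvalues strictly positive.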
How can I solve this problem? Where should I look next? I would appreciate any help.
Thanks