I am new to covariance selection and am trying to fit high-dimensional data with an undirected Gaussian graphical model. The graph structure of the multivariate normal distribution is given. I followed the covariance selection procedure introduced in "The Elements of Statistical Learning" (p. 634). It works well with low-dimensional data, but when I use it on high-dimensional problems (24-dimensional vectors), it sometimes does not converge. The algorithm is easy to understand and implement, so I don't know what went wrong. My questions are as follows:
Why does the algorithm fail to converge with high-dimensional data? Is there any way to improve it?
Are there any other covariance selection algorithms that are easy to implement?
Thanks in advance.
This is the MATLAB code that I followed: https://github.com/probml/pmtk3/blob/master/toolbox/GraphicalModels/ggm/sub/ggmFitHtf.m
and this is the code I implemented in Python:
import numpy as np
import numpy.matlib

def ggmfit(S, G, maxIter):
    '''
    MLE for a precision matrix given known zeros in the graph.
    S is the empirical covariance matrix (numpy matrix),
    G is the graph structure (numpy matrix).
    Hastie, Tibshirani & Friedman ("Elements" book, 2nd Ed, 2008, p634)
    '''
    S = np.matrix(S)
    G = np.matrix(G)
    convergenceFlag = False
    p = S.shape[0]
    W = S
    theta = np.matlib.zeros((p, p), dtype=W.dtype)  # precision matrix
    for i in range(maxIter):
        normW = np.linalg.norm(W)
        for j in range(p):
            notj = list(range(p))
            notj.pop(j)
            W11 = W[notj][:, notj]
            S12 = S[notj][:, j]
            S22 = S[j, j]
            # positions of the non-zero entries (edges) in row j of G
            notzero = ~(G[j][:, notj] == 0)
            notzero = np.squeeze(np.asarray(notzero), axis=0)
            S12_nz = S12[notzero]
            W11_nz = W11[notzero][:, notzero]
            beta = np.matlib.zeros((p - 1, 1), dtype=W.dtype)
            beta[notzero] = W11_nz.I * S12_nz
            W12 = W11 * beta
            W[notj][:, j] = W12
            W[j][:, notj] = W12.T
            if (i == (maxIter - 1)) or convergenceFlag:
                theta22 = 1 / (S22 - W12.T * beta)
                assert theta22 >= 0
                theta12 = -beta * theta22
                theta[j, j] = theta22
                theta[notj][:, j] = theta12
                theta[j][:, notj] = theta12.T
        if convergenceFlag:
            break
        normW_ = np.linalg.norm(W)
        delta = np.abs(normW_ - normW)
        if i % 1000 == 0:
            print("{:.6f}".format(delta))
        if delta < 10e-6:
            convergenceFlag = True
    W = (W + W.T) / 2
    theta = (theta + theta.T) / 2
    return W, theta, (i, delta)
I solved the problem myself. There was a bug in my Python code: the line `W[notj][:, j] = W12` never actually wrote `W12` into `W`. The reason is that fancy indexing with a list of indices (`W[notj]`) returns a copy rather than a view, so the subsequent `[:, j] = ...` assigns into that temporary copy and `W` is left unchanged. After replacing the chained indexing with a single indexing expression, the algorithm worked well.
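For reference, here is a minimal sketch of the pitfall and one way to write through to the array using `np.ix_` (the array values and sizes below are made up purely for illustration):

```python
import numpy as np

# Toy array standing in for W; only the indexing pattern matters.
W = np.arange(16, dtype=float).reshape(4, 4)
notj, j = [0, 1, 3], 2
W12 = np.array([[-1.0], [-2.0], [-3.0]])

# Chained fancy indexing: W_bad[notj] returns a COPY of those rows,
# so the assignment lands in the temporary copy and is lost.
W_bad = W.copy()
W_bad[notj][:, j] = W12.ravel()
print(np.array_equal(W_bad, W))  # True: W_bad was never modified

# Single indexing expression via np.ix_: builds one index into the
# original array, so the assignment writes through.
W_good = W.copy()
W_good[np.ix_(notj, [j])] = W12
print(W_good[notj, j])  # [-1. -2. -3.]
```

The same pattern applies to every chained assignment in the listing above, e.g. `W[np.ix_(notj, [j])] = W12` and `theta[np.ix_(notj, [j])] = theta12`.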