Relationship between eigenvalues and positive semi-definite matrices?


I've been trying to write a function (python) to sample covariance matrices. Not sample from them, but to sample the matrices themselves. What I've found is that the positive semi-definite constraint makes sampling somewhat challenging.

By intuition, $|\operatorname{cov}(a,b)| \le \min[\operatorname{var}(a), \operatorname{var}(b)]$ (the Cauchy–Schwarz bound is actually $|\operatorname{cov}(a,b)| \le \sqrt{\operatorname{var}(a)\operatorname{var}(b)}$). Great, but this pairwise constraint alone doesn't meet the positive semidefinite requirement.
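To see that the pairwise bound is not enough, here is a small check (assuming NumPy): a $3 \times 3$ matrix whose off-diagonal entries all satisfy the bound, yet which has a negative eigenvalue.

```python
import numpy as np

# Every off-diagonal entry satisfies |cov| <= min(var_i, var_j) = 1,
# yet the matrix is not positive semidefinite.
C = np.array([
    [ 1.0, 0.9, -0.9],
    [ 0.9, 1.0,  0.9],
    [-0.9, 0.9,  1.0],
])

eigenvalues = np.linalg.eigvalsh(C)  # real, sorted ascending (symmetric input)
print(eigenvalues)  # smallest eigenvalue is -0.8, so C is not PSD
```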

import numpy as np

def cov_sampler(n, max_var=3):
    # 1. Sample variances randomly
    variances = np.random.uniform(low=0.01, high=max_var, size=n)
    
    # 2. Initialize cov matrix
    covariances = np.zeros([n, n])
    
    # 3.0. Assign variance or covariance values respectively
    for i in range(n):
        for j in range(i,n):

            var_i = variances[i]
            var_j = variances[j]  

            # 3.1. assign variance along diagonal          
            if i==j:
                covariances[i,j] = var_i
                covariances[j,i] = var_i                
                continue 

            # 3.2. ensure cov(a,b) <= min[ var(a), var(b) ]
            else:
                max_cov = min(var_i, var_j) 
                cov = np.random.uniform(low=-max_cov, high=max_cov)
                covariances[i,j] = cov
                covariances[j,i] = cov
    
    # 4.0 If min eigenvalue < 0, subtract it from the diagonal
    # (eigvalsh is the right routine here: the matrix is symmetric,
    # so its eigenvalues are real and come back sorted)
    eigenvalues = np.linalg.eigvalsh(covariances)
    min_ev = np.min(eigenvalues)
    diagonal = np.identity(n=n) * min_ev * int(min_ev < 0)

    # 5.0 exit
    return covariances - diagonal
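For comparison, a common way to avoid the eigenvalue repair entirely is to build the matrix as a Gram product $AA^\top$, which is PSD by construction. A sketch (assuming NumPy; the rescaling step to hit freshly sampled variances is my own addition, not part of the code above):

```python
import numpy as np

def cov_sampler_gram(n, max_var=3):
    # A @ A.T is always positive semidefinite: x' A A' x = ||A' x||^2 >= 0.
    A = np.random.randn(n, n)
    C = A @ A.T
    # Rescale so the diagonal holds the sampled variances; a congruence
    # D C D with positive diagonal D preserves positive semidefiniteness.
    variances = np.random.uniform(low=0.01, high=max_var, size=n)
    d = np.sqrt(variances / np.diag(C))
    return C * np.outer(d, d)
```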

My question is: why is the eigen dance necessary?

Intuitively I can reason that if $\operatorname{corr}(a,b)$ is strongly positive and $\operatorname{corr}(b,c)$ is strongly positive, then we might know something about $\operatorname{corr}(a,c)$.
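That intuition can be checked numerically. With $\operatorname{corr}(a,b) = \operatorname{corr}(b,c) = 0.9$, positive semidefiniteness forces $\operatorname{corr}(a,c) \ge 2(0.9)^2 - 1 = 0.62$. A sketch (assuming NumPy) probing values on either side of that threshold:

```python
import numpy as np

def min_eig(r_ac):
    # Correlation matrix with corr(a,b) = corr(b,c) = 0.9 fixed.
    C = np.array([
        [1.0,  0.9,  r_ac],
        [0.9,  1.0,  0.9],
        [r_ac, 0.9,  1.0],
    ])
    return np.linalg.eigvalsh(C)[0]

print(min_eig(0.5))  # negative: corr(a,c) = 0.5 is impossible here
print(min_eig(0.7))  # positive: corr(a,c) = 0.7 is consistent
```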

But why would subtracting the negative eigenvalue from the diagonal be beneficial?
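One way to see it: adding a constant $c$ to the diagonal shifts every eigenvalue by exactly $c$, since if $Cv = \lambda v$ then $(C + cI)v = (\lambda + c)v$. Subtracting the (negative) minimum eigenvalue therefore lifts the whole spectrum until the smallest eigenvalue is zero, which is just enough for positive semidefiniteness. A quick check (assuming NumPy):

```python
import numpy as np

C = np.array([
    [ 1.0, 0.9, -0.9],
    [ 0.9, 1.0,  0.9],
    [-0.9, 0.9,  1.0],
])

min_ev = np.linalg.eigvalsh(C)[0]        # negative for this matrix
repaired = C - min_ev * np.identity(3)   # the same shift as in cov_sampler
print(np.linalg.eigvalsh(repaired))      # every eigenvalue shifted by -min_ev;
                                         # the smallest is now 0, so repaired is PSD
```

Note that the shift changes the diagonal too: the variances in the returned matrix are $|\lambda_{\min}|$ larger than the ones sampled in step 1.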