Calculating the gradient of a generalized method of moments (GMM) cost for a Gaussian mixture model


I'm trying to get the gradient of a generalized method of moments (GMM) cost that is based on the first and second moments of a Gaussian mixture model, but I've failed.

A few notations:

K - number of clusters,

N - dimension of the samples,

T - number of samples,

A - N*K matrix whose columns are the K cluster means.

Let's assume that all Gaussians share the same isotropic covariance (σ^2*I) and that the prior probabilities p of a sample being generated from each cluster are known. Let's denote θ={A,p}.

We draw T samples and denote them {y_i}, i=1:T.

The sample moments are: M_1=(1/T)∑y_i and M_2=(1/T)∑y_i*y_i^T, summing over i=1:T.

We will try to apply GMM (here: the generalized method of moments) to the 1st and 2nd moments. The form of the problem is:

g(y,θ)=[g_1(y,θ), g_2(y,θ)]=[M_1-∑p_k*A_k, M_2-(∑p_k*A_k*A_k^T+σ^2*I_N)]

Note: p_k is element k of p, A_k is column k of A, and the sums run over the K clusters.

cost(A) = g(y,θ)' * W * g(y,θ), where W is the optimal weighting matrix according to the GMM formulation.
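To make the setup concrete, here is a minimal NumPy sketch of the stacked moment conditions and the cost on simulated data. It is only an illustration under assumptions that are not in the post: samples are stored as rows of a T*N array, g_2 is flattened row by row so that g is a single vector of length N+N*N, and W is an identity placeholder rather than the optimal weighting. The function names and numeric values are hypothetical.

```python
import numpy as np

def sample_mixture(A, p, sigma, T, rng):
    """Draw T samples: pick a cluster with probabilities p, add isotropic noise."""
    N, K = A.shape
    labels = rng.choice(K, size=T, p=p)
    return A[:, labels].T + sigma * rng.normal(size=(T, N))

def moment_conditions(A, p, sigma, y):
    """Stack g_1 (length N) and the flattened g_2 (length N*N) into one vector."""
    T, N = y.shape
    M1 = y.mean(axis=0)                  # (1/T) * sum_i y_i
    M2 = y.T @ y / T                     # (1/T) * sum_i y_i y_i^T
    g1 = M1 - A @ p                      # M_1 - sum_k p_k A_k
    g2 = M2 - ((A * p) @ A.T + sigma**2 * np.eye(N))
    return np.concatenate([g1, g2.ravel()])

def gmm_cost(A, p, sigma, y, W):
    """Scalar cost g' W g."""
    g = moment_conditions(A, p, sigma, y)
    return g @ W @ g

rng = np.random.default_rng(0)
A_true = np.array([[2.0, -1.0], [0.5, 1.5], [-1.0, 0.0]])   # N=3, K=2 (made-up values)
p = np.array([0.4, 0.6])
sigma = 0.3
y = sample_mixture(A_true, p, sigma, T=20000, rng=rng)

N = A_true.shape[0]
W = np.eye(N + N * N)                    # placeholder weighting, NOT the optimal W
g = moment_conditions(A_true, p, sigma, y)
cost_true = gmm_cost(A_true, p, sigma, y, W)
cost_wrong = gmm_cost(A_true + 1.0, p, sigma, y, W)
print(cost_true, cost_wrong)
```

With enough samples the cost at the true means should be close to zero and clearly smaller than at a shifted A, which is a cheap sanity check before attempting any gradient.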

I haven't found any elegant way to calculate the N*K gradient of this scalar cost.

I couldn't match this problem to a simple form from The Matrix Cookbook. In addition, I calculated grad(g_1(y,θ)) and grad(g_2(y,θ)) by hand, by generalizing small problems to a closed-form solution.

I've also tried applying an SVD to the W matrix to get a simple g_1'(y,θ) * W1 * g_1(y,θ) + g_2'(y,θ) * W2 * g_2(y,θ) form, but that failed as well...

  • A closed form for d/dx (f(x)' * W * g(x)), where f(x) and g(x) are vectors, would be nice as well.

What kind of direction would lead me to a simple solution?

There is 1 solution below.

So I found the solution to the problem. Partition W into blocks that match g_1 and g_2: W = [W11, W12; W21, W22]. Then

g(y,θ)' * W * g(y,θ) = g_1' * W11 * g_1 + g_1' * W12 * g_2 + g_2' * W21 * g_1 + g_2' * W22 * g_2

  • grad(g_i' * Wii * g_i) = g_i' * (Wii + Wii') * grad(g_i) (cookbook eq), where grad(g_i) is the Jacobian of g_i with respect to the parameters
  • grad(g_i' * Wij * g_j) = (product rule) = (Wij * g_j)' * grad(g_i) + g_i' * Wij * grad(g_j)
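As a hedged sketch of how these rules might be assembled into the N*K gradient with respect to A: assume W is held fixed (it does not depend on A), g_2 is stacked as vec(g_2), and P = diag(p). The symbols h, h_1, H_2 and P are introduced here for illustration only and do not appear in the original post.

```latex
\begin{aligned}
c(A) &= g^\top W g, \qquad g = \begin{bmatrix} g_1 \\ \operatorname{vec}(g_2) \end{bmatrix},\\
dg_1 &= -\,dA\,p, \qquad dg_2 = -\left(dA\,P A^\top + A P\,dA^\top\right),\\
h &= (W + W^\top)\,g = \begin{bmatrix} h_1 \\ \operatorname{vec}(H_2) \end{bmatrix},\\
dc &= h_1^\top\,dg_1 + \operatorname{tr}\!\left(H_2^\top\,dg_2\right),\\
\nabla_A c &= -\left(h_1 p^\top + \left(H_2 + H_2^\top\right) A P\right).
\end{aligned}
```

The last line follows by substituting dg_1 and dg_2 into dc and collecting the terms that multiply dA under the trace.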

All four terms can be derived from the two rules above, and that's it.
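A quick way to gain confidence in any hand-derived gradient of this cost is to compare it against central finite differences. The sketch below does that for one candidate closed form, grad_A = -(h_1*p' + (H_2 + H_2')*A*P) with h = (W + W')*g and P = diag(p). That formula is one way of applying the product rule above, not something stated in the original answer. Further assumptions: W is treated as a constant, samples are rows of a T*N array, g_2 is flattened row by row, and all function names and test values are hypothetical.

```python
import numpy as np

def moment_conditions(A, p, sigma, y):
    """Stack g_1 (length N) and the row-major flattened g_2 (length N*N)."""
    T, N = y.shape
    g1 = y.mean(axis=0) - A @ p
    g2 = y.T @ y / T - ((A * p) @ A.T + sigma**2 * np.eye(N))
    return np.concatenate([g1, g2.ravel()])

def gmm_cost(A, p, sigma, y, W):
    g = moment_conditions(A, p, sigma, y)
    return g @ W @ g

def gmm_cost_grad(A, p, sigma, y, W):
    """Candidate analytic gradient w.r.t. A, assuming W does not depend on A."""
    N = A.shape[0]
    g = moment_conditions(A, p, sigma, y)
    h = (W + W.T) @ g                        # derivative of g'Wg w.r.t. g
    h1, H2 = h[:N], h[N:].reshape(N, N)      # same row-major layout as g_2
    return -(np.outer(h1, p) + (H2 + H2.T) @ (A * p))

def numerical_grad(f, A, eps=1e-6):
    """Central finite differences, one entry of A at a time."""
    G = np.zeros_like(A)
    for idx in np.ndindex(*A.shape):
        E = np.zeros_like(A)
        E[idx] = eps
        G[idx] = (f(A + E) - f(A - E)) / (2 * eps)
    return G

rng = np.random.default_rng(0)
N, K, T = 3, 2, 50
A = rng.normal(size=(N, K))
p = np.array([0.3, 0.7])
sigma = 0.5
y = rng.normal(size=(T, N))                  # arbitrary data, only for the check
W = rng.normal(size=(N + N * N, N + N * N))  # arbitrary, deliberately non-symmetric

analytic = gmm_cost_grad(A, p, sigma, y, W)
numeric = numerical_grad(lambda X: gmm_cost(X, p, sigma, y, W), A)
print(np.max(np.abs(analytic - numeric)))
```

Using a non-symmetric test W exercises the (W + W') term; a real GMM weighting matrix would be symmetric, in which case that term reduces to 2*W.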