Approximate arbitrary distribution with a Gaussian Mixture

I saw the claim that a Gaussian mixture is a universal approximator of densities here. I am trying to use that to approximate an arbitrary distribution, for example a univariate bounded uniform distribution with density $f(x) = 1/(b-a)$ for $x \in [a,b]$ and $f(x) = 0$ otherwise.

The only approach I have thought of is nonlinear optimization (e.g. fminsearch in MATLAB) of the squared error against the true density (the $L_2$ norm). The problem is that the result depends heavily on the initialization of the parameters ($\gamma_i, \mu_i, \sigma_i$), so I get different answers from different starting points. I'll attach the code at the bottom, although it is not really the issue. I'd like to know whether there is a better, or perhaps analytic, way to approximate a distribution. For example, how is the claim "can be approximated with any specific nonzero amount of error" reconciled here when I can't control the actual error? (I know my example isn't smooth.)
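For reference, the $L_2$ objective described above can be written out explicitly for the uniform example. This is only a sketch in my own notation: it assumes the mixture is $g(x)=\sum_{i=1}^{k}\gamma_i\,\mathcal{N}(x;\mu_i,\sigma_i^2)$ with $\gamma_i \ge 0$ and $\sum_i \gamma_i = 1$, that $f$ is the uniform density on $[a,b]$, and that $\Phi$ is the standard normal CDF. The code below normalizes both curves to unit $L_2$ norm on a grid instead, so its numbers will not match this expression.

$$
J(\gamma,\mu,\sigma)=\int_{-\infty}^{\infty}\bigl(f(x)-g(x)\bigr)^2\,dx
=\frac{1}{b-a}
-\frac{2}{b-a}\sum_{i=1}^{k}\gamma_i\left[\Phi\!\left(\frac{b-\mu_i}{\sigma_i}\right)-\Phi\!\left(\frac{a-\mu_i}{\sigma_i}\right)\right]
+\sum_{i=1}^{k}\sum_{j=1}^{k}\gamma_i\gamma_j\,\mathcal{N}\!\left(\mu_i;\mu_j,\sigma_i^2+\sigma_j^2\right)
$$

The three terms are $\int f^2$, $-2\int fg$ and $\int g^2$. Each is available in closed form, so the error can be evaluated without a grid, but $J$ is still nonconvex in $\mu_i$ and $\sigma_i$, so the sensitivity to initialization does not go away.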

Thank you.

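% Just an example: three mixands, each given as [gamma, mu, sigma].
params0 = [1/3, 1, 1, ...
           1/3, 1, 1, ...
           1/3, 1, 1];
[params, fval] = fminsearch(@distrib_gaussian_mixture, params0);

% Objective function. The abs() calls and the renormalization are "tricks"
% that avoid the need for a constrained program.
function dist = distrib_gaussian_mixture(params)
    % fminsearch passes the parameters as a flat vector; reshape it to one
    % row per mixand so that row i holds [gamma_i, mu_i, sigma_i].
    params = reshape(params, 3, []).';
    k = size(params, 1);                    % number of mixands
    x = -3:0.01:3;
    pdf = 0*x;
    true_pdf = pdf;
    true_pdf(350:450) = 1;                  % uniform on roughly [0.49, 1.49]
    true_pdf = true_pdf / norm(true_pdf);   % unit L2 norm on the grid, not unit area

    for i = 1:k
        gamma = abs(params(i,1));           % weight, kept nonnegative
        mu = params(i,2);
        sigma = abs(params(i,3));           % standard deviation, kept nonnegative
        pdf = pdf + gamma * normpdf(x, mu, sigma);
    end
    pdf = pdf / norm(pdf);                  % same normalization as true_pdf

    dist = norm(pdf - true_pdf);
end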

EDIT:

I found that MATLAB has a function called fitgmdist which basically does what I want. The catch is that it works from data rather than from the density itself: I need to draw random samples from the pdf, and it fits the mixture to those samples. That is not very convenient and, more importantly, the results will vary slightly from run to run.
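A minimal sketch of that sample-then-fit route, assuming the Statistics and Machine Learning Toolbox is available. The support `[a, b]`, the sample size `n`, the number of mixands `k`, and the seed are all illustrative choices of mine, not values from any reference:

```matlab
% Sketch: fit a Gaussian mixture to samples drawn from a uniform density.
% Requires the Statistics and Machine Learning Toolbox (fitgmdist, pdf).
rng(0);                               % fix the seed so the fit is repeatable
a = 0.49; b = 1.49;                   % assumed support of the uniform target
n = 5000;                             % assumed sample size
k = 3;                                % assumed number of mixands
samples = a + (b - a) * rand(n, 1);   % n-by-1 column of draws from U(a, b)

% Several EM restarts reduce the dependence on initialization; a small
% regularization keeps the component variances away from zero.
gm = fitgmdist(samples, k, 'Replicates', 5, ...
               'RegularizationValue', 1e-6, ...
               'Options', statset('MaxIter', 1000));

dx = 0.01;
x = (-3:dx:3).';                      % evaluation grid as a column vector
approx_pdf = pdf(gm, x);              % fitted mixture density on the grid
true_pdf = double(x >= a & x <= b) / (b - a);

% Riemann-sum estimate of the L2 distance between the two densities
l2_err = sqrt(sum((approx_pdf - true_pdf).^2) * dx);
```

Fixing the seed with `rng` removes the run-to-run variation for a given sample, but the fit still maximizes likelihood on the samples rather than minimizing the $L_2$ error directly.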

I would close my question, except that I still do not have an answer for how to guarantee the error bounds (other than increasing the number of mixands until the error is small enough).
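The "increase the number of mixands until it is good enough" loop could look roughly like the sketch below. It assumes the Statistics and Machine Learning Toolbox, and the tolerance `tol`, the cap `k_max`, the support and the sample size are hypothetical values chosen only for illustration. Nothing here guarantees that the tolerance is reached: the fit is driven by likelihood on random samples, so the $L_2$ error need not decrease monotonically with `k`.

```matlab
% Sketch: grow the mixture until the grid-based L2 error drops below tol.
rng(0);                                   % repeatable samples and fits
a = 0.49; b = 1.49;                       % assumed support of the uniform target
samples = a + (b - a) * rand(5000, 1);    % draws from U(a, b)

dx = 0.01;
x = (-3:dx:3).';                          % evaluation grid (column vector)
true_pdf = double(x >= a & x <= b) / (b - a);

tol = 0.2;                                % hypothetical error tolerance
k_max = 10;                               % hypothetical cap on the number of mixands
errs = nan(k_max, 1);                     % L2 error recorded for each k tried

for k = 1:k_max
    gm = fitgmdist(samples, k, 'Replicates', 5, ...
                   'RegularizationValue', 1e-6, ...
                   'Options', statset('MaxIter', 1000));
    % Riemann-sum estimate of the L2 distance for this k
    errs(k) = sqrt(sum((pdf(gm, x) - true_pdf).^2) * dx);
    if errs(k) < tol
        break;                            % stop at the first k that meets tol
    end
end
```

After the loop, `k` holds the last number of mixands tried and `errs(1:k)` the errors seen so far, which makes it easy to check whether the loop stopped because of `tol` or because it hit `k_max`.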