I saw the claim here that a Gaussian mixture is a universal approximator of densities. I am trying to use that to approximate an arbitrary distribution, for example a bounded univariate uniform distribution with $p(x) = \frac{1}{b-a}$ for $x \in [a,b]$ and $p(x) = 0$ otherwise.
The only way I thought of doing this is with a nonlinear optimization routine (like fminsearch in Matlab) minimizing the squared error against the true distribution (the $L_2$ norm). The problem is that the result depends heavily on the initialization of the parameters ($\gamma_i, \mu_i, \sigma_i$), and I get varying results. I'll attach the code at the bottom, although the code itself is not really the issue. I'd like to know whether there is a better, or perhaps analytic, way to approximate a distribution. For example, how is the claim that a density "can be approximated with any specific nonzero amount of error" reconciled here when I can't control the actual error? (I know my example isn't smooth.)
Thank you.
% just an example: three components, each parameterized by (gamma, mu, sigma)
params0 = [1/3, 1, 1, ...
           1/3, 1, 1, ...
           1/3, 1, 1];
[params, fval] = fminsearch(@distrib_gaussian_mixture, params0);

% the objective to minimize; some "tricks" (abs() on the weights and sigmas)
% are used here to avoid the need for a constrained program
function dist = distrib_gaussian_mixture(params)
    % fminsearch passes params as a row vector; reshape to one component per row
    params = reshape(params(:), 3, []).';
    k = size(params, 1);
    x = -3:0.01:3;
    pdf = 0*x;
    true_pdf = pdf;
    true_pdf(350:450) = 1;                 % uniform bump on roughly [0.49, 1.49]
    true_pdf = true_pdf / norm(true_pdf);  % L2-normalize the discretized vector
    for i = 1:k
        gamma = abs(params(i,1));          % weight, forced nonnegative
        mu    = params(i,2);               % mean
        sigma = abs(params(i,3));          % std. deviation, forced nonnegative
        pdf = pdf + gamma * normpdf(x, mu, sigma);
    end
    pdf = pdf / norm(pdf);                 % same L2 normalization as the target
    dist = norm(pdf - true_pdf);           % L2 distance to the target
end
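For what it's worth, the sensitivity to initialization can be tamed with random restarts: run the same $L_2$ fit from several random starting points and keep the best result. A minimal sketch in Python (scipy.optimize.minimize with Nelder-Mead stands in for fminsearch; the grid, target, component count, and abs() tricks mirror the Matlab code above, while the number of restarts and the sampling ranges for the starting points are my own arbitrary choices):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

x = np.arange(-3, 3.01, 0.01)          # same grid as the Matlab code
true_pdf = np.zeros_like(x)
true_pdf[349:450] = 1.0                # same uniform bump as the Matlab code
true_pdf /= np.linalg.norm(true_pdf)   # L2-normalize the discretized vector

def l2_distance(params):
    """L2 distance between the normalized mixture and the target on the grid."""
    p = params.reshape(-1, 3)          # one (weight, mu, sigma) triple per row
    pdf = np.zeros_like(x)
    for w, mu, sigma in p:
        pdf += abs(w) * norm.pdf(x, mu, abs(sigma))  # abs() avoids constraints
    n = np.linalg.norm(pdf)
    return np.linalg.norm(pdf / n - true_pdf) if n > 0 else np.inf

rng = np.random.default_rng(0)
best = None
for _ in range(10):                    # 10 random restarts (arbitrary choice)
    p0 = rng.uniform([0, -3, 0.05], [1, 3, 1], size=(3, 3)).ravel()
    res = minimize(l2_distance, p0, method='Nelder-Mead',
                   options={'maxiter': 2000})
    if best is None or res.fun < best.fun:
        best = res
print(best.fun)
```

This does not remove the dependence on initialization, it just samples over it; but in my experience it makes results far more repeatable than a single hand-picked start.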
EDIT:
I found that Matlab has a function called fitgmdist, which basically does what I want. The problem is that it works in a roundabout way: instead of fitting the pdf directly, I have to draw random samples from the pdf, and it fits the mixture to those samples (inconvenient, and more importantly the results vary a little from run to run).
I would close my question, though I still do not have an answer for how to guarantee the error bound (other than increasing the number of mixture components until the error is small enough).
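On the error-bound point: for this particular target there is a simple constructive approach that makes the "increase the number of components" loop concrete. Place $k$ equal-weight Gaussians at evenly spaced means in $[a, b]$ with a width shrinking in $k$, and grow $k$ until the $L_2$ error drops below a chosen tolerance. A rough Python sketch (the $\sigma = (b-a)/k$ width, the evenly spaced means, and the tolerance are my own heuristic choices, not from any reference):

```python
import numpy as np
from scipy.stats import norm

a, b = 0.49, 1.49                       # support of the uniform target
x = np.arange(-3, 3.01, 0.01)
true_pdf = np.where((x >= a) & (x <= b), 1.0 / (b - a), 0.0)

def mixture_error(k):
    """L2 grid error of k equal-weight, evenly spaced Gaussians vs. the uniform."""
    mus = np.linspace(a, b, k)
    sigma = (b - a) / k                 # heuristic width choice (assumption)
    pdf = sum(norm.pdf(x, mu, sigma) for mu in mus) / k
    dx = x[1] - x[0]
    return np.sqrt(np.sum((pdf - true_pdf) ** 2) * dx)  # discrete L2 norm

tol = 0.1
k = 1
while mixture_error(k) > tol:           # grow k until the tolerance is met
    k += 1
print(k, mixture_error(k))
```

The residual error is dominated by the smoothing of the two discontinuities at $a$ and $b$, and it shrinks as the component width does, which is why (as I understand the universal-approximation claim) any nonzero tolerance is reachable with enough components, while zero error is not.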