I am trying to fit basic Gaussian mixture with a Bayesian model.
My likelihood function is Gaussian, with std=1, and the only parameter is the mean, chosen from $\{0,1,\dots,14,15\}$ and my prior is uniform.
I would expect that if the data is sampled from 2 distinct Gaussians, then the posterior distribution should converge to those two Gaussian means.
Where in fact, the Bayesian model converged to one Gaussian model with the mean of the two means of the two Gaussians.
Here is my code:
import bayes
import numpy as np
from scipy.stats import norm
mixture_means=[5,10]
sample_size=30
mixture=list(np.random.normal(mixture_means[0],size=sample_size/2))+list(np.random.normal(mixture_means[1],size=sample_size/2))
model=bayes.DiscreteBayes()
vals=range(16)
def norm_likelihood(mean,std=1.0):
'''Returns normal likelihood function with given mean and std'''
ret=map(norm.pdf,[(v-mean)/std for v in vals])
s=sum(ret)
ret=map(lambda x:x/s,ret)
return lambda i: ret[int(round(i))] if int(round(i)) in vals else 0
for v in vals:
model.hypo[v]=bayes.hypothesis(likelihood=norm_likelihood(v),belief=1.0)
model.normalize()
model.update(mixture)
posterior=[m.belief for m in model.hypo.values()]
The content of the bayes module can be found here
The resulting posterior distribution is
{0: 0.0,
1: 1.7310930899121284e-247,
2: 2.1969933450785758e-174,
3: 1.2166851528859524e-113,
4: 7.1750915476302883e-66,
5: 3.9752699764577839e-31,
6: 2.0610600467738752e-09,
7: 0.99995460007033121,
8: 4.5397868608866683e-05,
9: 1.9286622833850024e-22,
10: 7.6673013627085034e-53,
11: 2.8524215798147219e-96,
12: 9.9695335021693221e-153,
13: 3.7105281612774415e-222,
14: 0.0,
15: 0.0}
Where the actual 2 means are $5$ and $10$, and for some reason the posterior converged to $7$.
Why is that ?
The belief for gaussian $\mu=5$ and $\mu=10$ should be strongest, since the data is actually being sampled from those distribution.
Instead, we see that the belief for those two means is negligible