I am having trouble understanding MAP estimation for discriminative models. I will use the notation from the first two pages of this paper: https://www.microsoft.com/en-us/research/wp-content/uploads/2016/05/Bishop-Valencia-07.pdf As far as I understand, the posterior distribution of a discriminative model is $p(\theta|X, C)$, where $X = \{x_1,x_2,\dots,x_n\}$ is the training set and $C=\{c_1,c_2,\dots,c_n\}$ are the corresponding labels.
As usual, the posterior is written in terms of the prior and the likelihood:
$$p(\theta|X,C) \overset{?}{=} \frac{p(\theta)L(\theta)}{p(C|X)} = \frac{p(\theta)p(C|X,\theta)}{p(C|X)}$$
The step I do not understand is the one marked with "?".
Moreover, the same paper also notes that $p(\theta,C|X)=p(\theta)L(\theta)$. However, taking this a bit further:
$$\begin{aligned}
p(\theta,C|X) &= p(\theta)L(\theta) \\
\implies p(\theta,C|X) &= p(\theta)\,p(C|X,\theta) \\
\implies \frac{p(\theta,C,X)}{p(X)} &= \frac{p(\theta)\,p(\theta,C,X)}{p(X,\theta)} \\
\implies p(X)\,p(\theta) &= p(X,\theta)
\end{aligned}$$
Therefore, it seems that $X$ and $\theta$ are assumed to be independent, but I fail to see why. Given such independence, proving the step marked with "?" would also be straightforward.
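For what it's worth, I checked numerically that both identities do hold *if* one assumes $p(X,\theta)=p(X)p(\theta)$. Below is a toy discrete sketch (my own construction, not from the paper): I pick arbitrary distributions $p(\theta)$, $p(x)$, and $p(c|x,\theta)$, build the joint under the independence assumption, and verify both the paper's identity and the Bayes step marked with "?".

```python
import numpy as np

# Toy discrete variables: theta, x, c each take values in {0, 1}.
p_theta = np.array([0.3, 0.7])            # p(theta)
p_x     = np.array([0.6, 0.4])            # p(x)
rng = np.random.default_rng(0)
# p(c | x, theta): shape (x, theta, c), each row sums to 1
p_c_given = rng.dirichlet([1.0, 1.0], size=(2, 2))

# Joint p(theta, x, c), built ASSUMING p(x, theta) = p(x) p(theta)
joint = np.einsum('t,x,xtc->txc', p_theta, p_x, p_c_given)

# p(theta, c | x) = p(theta, x, c) / p(x)
p_tc_given_x = joint / p_x[None, :, None]

# Check the paper's identity: p(theta, c | x) = p(theta) p(c | x, theta)
rhs = p_theta[:, None, None] * np.transpose(p_c_given, (1, 0, 2))
assert np.allclose(p_tc_given_x, rhs)

# Check the step marked "?": p(theta | x, c) = p(theta) p(c | x, theta) / p(c | x)
p_c_given_x = joint.sum(axis=0) / p_x[:, None]        # p(c | x), shape (x, c)
posterior = joint / joint.sum(axis=0, keepdims=True)  # p(theta | x, c)
bayes = rhs / p_c_given_x[None, :, :]
assert np.allclose(posterior, bayes)
print("both identities hold under p(x, theta) = p(x) p(theta)")
```

So the algebra is consistent once the independence is granted; my question is where that independence assumption comes from.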