I would like to fit a distribution $f(\cdot;\theta)$ to a sample $\{x_1,\dots,x_n\}$, obtaining an m.l.e. $\hat{\theta}$. I know that the random variable $X \sim f(\cdot;\theta)$ can be obtained by first generating a random variable $Y$ following a distribution with p.d.f. $g(\cdot;\theta)$ and then generating $X$ following a distribution with p.d.f. $h(\cdot;Y)$.
Is the maximum likelihood estimate $\hat{\theta}$ obtained by maximizing over $\theta$ the product $$ \prod_{i=1}^n f(x_i; \theta) $$ equal to the estimate $\hat{\theta}$ obtained by maximizing over $(\theta, y_1, \dots, y_n)$ the product $$ \prod_{i=1}^n g(y_i; \theta) h(x_i; y_i)? $$
(I didn't know whether to ask this question here or on Cross Validated, so I posted it on both sites.)
Since you don't observe $y_i$, you have this density for $X_i{:}$ $$ \int_{\large\mathscr Y} h(x_i;y) g(y;\theta) \, dy $$ (where $\mathscr Y$ is the space of possible $y$ values).
Therefore the likelihood function is $$ L(\theta) = \prod_{i=1}^n \int_{\large\mathscr Y} h(x_i;y) g(y;\theta) \, dy. $$
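To make this concrete, here is a minimal numerical sketch of maximizing that integrated likelihood. The question does not specify $g$ and $h$, so the choices below are purely illustrative assumptions: $Y \sim N(\theta, 1)$ and $X \mid Y = y \sim N(y, 1)$. The sample `xs`, the integration range, and the helper names are likewise hypothetical. The integral over $y$ is approximated with the trapezoid rule and $\log L(\theta)$ is maximized with a golden-section search, which assumes the log-likelihood is unimodal on the search interval.

```python
import math


def normal_pdf(z, mean, sd):
    """Density of N(mean, sd^2) evaluated at z."""
    return math.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def g(y, theta):
    # Assumed latent density (not from the question): Y ~ N(theta, 1).
    return normal_pdf(y, theta, 1.0)


def h(x, y):
    # Assumed conditional density (not from the question): X | Y = y ~ N(y, 1).
    return normal_pdf(x, y, 1.0)


def marginal_density(x, theta, lo=-12.0, hi=12.0, steps=480):
    """Trapezoid-rule approximation of the integral of h(x; y) g(y; theta) dy."""
    width = (hi - lo) / steps
    total = 0.5 * (h(x, lo) * g(lo, theta) + h(x, hi) * g(hi, theta))
    for k in range(1, steps):
        y = lo + k * width
        total += h(x, y) * g(y, theta)
    return total * width


def log_likelihood(theta, xs):
    """log L(theta): sum over the sample of the log marginal density."""
    return sum(math.log(marginal_density(x, theta)) for x in xs)


def maximize(fn, lo, hi, tol=1e-6):
    """Golden-section search for the maximizer of a unimodal fn on [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - ratio * (b - a)
    d = a + ratio * (b - a)
    fc, fd = fn(c), fn(d)
    while b - a > tol:
        if fc < fd:
            # The maximum lies in [c, b]; reuse d as the new left probe.
            a, c, fc = c, d, fd
            d = a + ratio * (b - a)
            fd = fn(d)
        else:
            # The maximum lies in [a, d]; reuse c as the new right probe.
            b, d, fd = d, c, fc
            c = b - ratio * (b - a)
            fc = fn(c)
    return 0.5 * (a + b)


xs = [0.3, 1.9, -0.4, 2.2, 1.0]  # hypothetical observed sample
theta_hat = maximize(lambda t: log_likelihood(t, xs), -5.0, 5.0)
print(theta_hat)
```

Under these assumed densities the marginal distribution of $X$ is $N(\theta, 2)$, so the maximizer should land close to the sample mean of `xs`, which gives a quick sanity check on the numerical integration. Note that the $y_i$ never appear as arguments being maximized over: they are integrated out before $\theta$ is estimated.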