I'm attempting to derive the maximum likelihood estimates for the parameters of the bivariate normal distribution model of linear regression, and I am well and truly stuck. I'm just looking for some guidance as to how to approach this question.
What I know:
$(x_i,y_i)$ are the outcomes of a bi-variate normal distribution $(X_i,Y_i) \space\forall i \in[1,n]$, s.t.
$X_i \sim N(\mu_X,\sigma^2_X)$,
$(Y_i|X_i =x_i) \sim N\!\left(\mu_Y -\mu_X \frac{\sigma_Y}{\sigma_X}\rho +\frac{\sigma_Y}{\sigma_X}\rho x_i,\;\sigma^2_Y(1-\rho^2)\right)$
And the MLEs are:
$\hat{\mu_X}=\bar X$
$\hat{\mu_Y} = \bar Y$
$\hat{\sigma^2_X} = \frac{1}{n}S_{XX}$
$\hat{\sigma^2_Y} = \frac{1}{n}S_{YY}$
$\hat\rho = R = \frac{S_{XY}}{\sqrt{S_{XX}S_{YY}}}$
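(Not part of the original question, but a useful sanity check.) These closed-form MLEs are just the sample means, the biased ($1/n$) sample variances, and the sample correlation, which is easy to verify numerically on simulated data. A minimal NumPy sketch, with all variable names my own:

```python
import numpy as np

rng = np.random.default_rng(0)

# True parameters of the bivariate normal (chosen arbitrarily for the check)
mu_x, mu_y = 1.0, -2.0
sigma_x, sigma_y, rho = 2.0, 0.5, 0.7
cov = [[sigma_x**2, rho * sigma_x * sigma_y],
       [rho * sigma_x * sigma_y, sigma_y**2]]

n = 100_000
x, y = rng.multivariate_normal([mu_x, mu_y], cov, size=n).T

# MLEs from the closed-form expressions (S_xx etc. are centred sums of squares)
xbar, ybar = x.mean(), y.mean()
S_xx = ((x - xbar)**2).sum()
S_yy = ((y - ybar)**2).sum()
S_xy = ((x - xbar) * (y - ybar)).sum()

sigma2_x_hat = S_xx / n                  # equals np.var(x) (ddof=0)
sigma2_y_hat = S_yy / n
rho_hat = S_xy / np.sqrt(S_xx * S_yy)    # equals np.corrcoef(x, y)[0, 1]

print(xbar, ybar, sigma2_x_hat, sigma2_y_hat, rho_hat)
```

With this many samples each estimate lands close to the true parameter, and `rho_hat` matches NumPy's own sample correlation exactly.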
What I attempted:
So obviously I first have to construct a likelihood function to optimise. I attempted to use the normal distribution for $Y|X$, with the mean and variance given by the conditional distribution above, then take the partial derivatives with respect to $\mu_X,\mu_Y,\sigma_X,\sigma_Y$ and $\rho$ respectively, set them all to $0$, and solve for each parameter.
So, taking the product over the $n$ observations, $L(\mu_X,\mu_Y,\sigma_X,\sigma_Y,\rho;\mathbf{x},\mathbf{y})=\prod_{i=1}^{n}\left[\sigma_Y\sqrt{2\pi(1-\rho^2)}\,\right]^{-1}\exp\left(-\frac{\left[y_i-\left(\mu_Y -\frac{\sigma_Y}{\sigma_X}\rho\mu_X+\frac{\sigma_Y}{\sigma_X}\rho x_i\right)\right]^2}{2\sigma_Y^2(1-\rho^2)}\right)$
But all I really got out of that was
$(y_i-\mu_Y) = \frac{\sigma_Y}{\sigma_X}\rho (x_i-\mu_X)$,
from the derivatives with respect to the first three parameters; the derivatives for the last two were complex enough that I didn't attempt them, as I'm sure I'm doing something wrong.
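(As an aside, not from the original post: that stationarity condition can be checked symbolically. A minimal SymPy sketch, differentiating the log of the conditional density for a single observation with respect to $\mu_Y$; all symbol names are my own.)

```python
import sympy as sp

# Symbols: the sigmas are positive; everything else is a generic real symbol
x, y, mu_x, mu_y, rho = sp.symbols('x y mu_x mu_y rho', real=True)
sx, sy = sp.symbols('sigma_x sigma_y', positive=True)

# Conditional mean and log-density of Y | X = x for one observation
cond_mean = mu_y - mu_x * sy / sx * rho + sy / sx * rho * x
loglik = (-sp.log(sy) - sp.Rational(1, 2) * sp.log(2 * sp.pi * (1 - rho**2))
          - (y - cond_mean)**2 / (2 * sy**2 * (1 - rho**2)))

# Score for mu_Y: setting this to zero and summing over i gives
#   sum(y_i - mu_Y) = rho * (sigma_Y / sigma_X) * sum(x_i - mu_X)
score_mu_y = sp.simplify(sp.diff(loglik, mu_y))
print(score_mu_y)
```

The score is proportional to $y - \mu_Y - \frac{\sigma_Y}{\sigma_X}\rho(x - \mu_X)$, which is exactly the per-observation condition above.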
So if someone could give me a bit of a hand or at least point me in the right direction it would be much appreciated.
Edit
OK, so I ended up solving most of my question with the Method of Moments, which doesn't necessarily give MLEs in general, but does in this case.
So I rearranged the form of the random variables X and Y to the following:
$X = \mu_X + \sigma_XU, \space \space Y = \mu_Y + \sigma_Y\rho U + \sigma_Y\sqrt{1-\rho^2}V$
where $U,V\overset{\text{iid}}{\sim} N(0,1)$.
I then equated the moments by: $\mathbb{E}(X^k)=\frac{1}{n}\sum_{i=1}^{n}X_i^k$, for both X and Y.
This gave me the estimators:
$\hat{\mu_X}=\bar X$
$\hat{\mu_Y}=\bar Y$
$\hat{\sigma}^2_X=\frac{1}{n}S_{XX}$
$\hat{\sigma}^2_Y=\frac{1}{n}S_{YY}$
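(A quick numerical check, not from the original post: the $U,V$ representation above really does produce the target covariance structure, so the sample correlation of the simulated pairs should come out near $\rho$. A minimal NumPy sketch, with all names my own.)

```python
import numpy as np

rng = np.random.default_rng(1)
mu_x, mu_y = 0.5, -1.0
sigma_x, sigma_y, rho = 1.5, 2.0, -0.6

n = 200_000
u = rng.standard_normal(n)
v = rng.standard_normal(n)

# The representation from the edit: X, Y built from independent N(0,1) U, V
x = mu_x + sigma_x * u
y = mu_y + sigma_y * rho * u + sigma_y * np.sqrt(1 - rho**2) * v

# Cov(X, Y) = rho * sigma_X * sigma_Y, so the sample correlation should be near rho
print(np.corrcoef(x, y)[0, 1])
```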
But I'm still not sure how to get the estimator for $\rho$.
You should sum it over all $n$ observations, i.e., $$ \sum_{i=1}^{n}(Y_i-\mu_Y) = \rho\frac{\sigma_Y}{\sigma_X}\sum_{i=1}^{n}(X_i-\mu_X), $$ thus $$ n\bar{Y}_n-n\mu_Y=\rho\frac{\sigma_Y}{\sigma_X}(n\bar{X}_n-n\mu_X). $$ Now plug in the MLE estimators for all parameters except $\mu_X$, in particular $\hat{\mu}_Y=\bar{Y}_n$, so you'll have $$ 0=\hat{\rho}\frac{S_Y}{S_X}(n\bar{X}_n - n\mu_X), $$ hence $\hat{\mu}_X=\bar{X}_n$. Use the same logic for the other parameters, i.e., plug in the MLEs and see what cancels out.
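(To supplement this, a numerical illustration of my own, not from the answer: with the other four parameters held at their MLEs, the log-likelihood profiled over $\rho$ is maximised at the sample correlation $R = S_{XY}/\sqrt{S_{XX}S_{YY}}$. A sketch using a grid search:)

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5_000
rho_true = 0.4
cov = [[1.0, rho_true], [rho_true, 1.0]]
x, y = rng.multivariate_normal([0.0, 0.0], cov, size=n).T

xbar, ybar = x.mean(), y.mean()
S_xx = ((x - xbar)**2).sum()
S_yy = ((y - ybar)**2).sum()
S_xy = ((x - xbar) * (y - ybar)).sum()
R = S_xy / np.sqrt(S_xx * S_yy)      # candidate MLE for rho
s2x, s2y = S_xx / n, S_yy / n        # MLEs for the variances

def loglik(rho):
    # Bivariate normal log-likelihood with mu's and sigma^2's at their MLEs
    z = ((x - xbar)**2 / s2x
         - 2 * rho * (x - xbar) * (y - ybar) / np.sqrt(s2x * s2y)
         + (y - ybar)**2 / s2y)
    return (-n * np.log(2 * np.pi * np.sqrt(s2x * s2y))
            - n / 2 * np.log(1 - rho**2)
            - z.sum() / (2 * (1 - rho**2)))

grid = np.linspace(-0.99, 0.99, 1_981)
rho_best = grid[np.argmax([loglik(r) for r in grid])]
print(R, rho_best)
```

The grid maximiser lands on the grid point next to $R$, consistent with $\hat\rho = R$ being the joint MLE.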