How to prove that the likelihood p(Y | w) = pE(Y − Xw) (pE is the pdf of the error E)?


According to 'Applied Data Science' (Langmore and Krasner), if we have the linear regression equation Y = Xw + E, then the likelihood is p(Y | w) = pE(Y − Xw), where pE is the pdf of the error E. How can this be proved?


On BEST ANSWER

It appears that you are asking about the probability density of $Y$ for a fixed value of $w.$ Conventionally one distinguishes between the notation for a random variable and that for the argument of its density function or of its c.d.f., often by using a capital letter for the random variable and a lower-case letter for the argument. Without this distinction, an expression like $\Pr(Y\le y)$ could not be understood, and there are other difficulties besides. Hence we seek $p_Y(y).$ The linear regression model says that $Y = Xw+\varepsilon,$ where typically $Y$ is an $n\times1$ column vector, $\varepsilon$ is an $n\times 1$ column vector of independent random variables with expected value $0$ and equal variances, $X$ is an $n\times p$ matrix, and $w$ is a $p\times 1$ column vector. The values of $X$ and $Y$ are observed; the values of $\varepsilon$ and $w$ are not, and one estimates $w$ by least squares.
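To make the setup concrete, here is a minimal sketch of the model $Y = Xw+\varepsilon$ and the least-squares estimate of $w$, using NumPy. The dimensions, the "true" $w$, the noise scale and the seed are all made-up values for illustration; the errors are assumed Gaussian only for convenience, which the model above does not require.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only

n, p = 200, 3                        # illustrative sizes (assumption)
X = rng.normal(size=(n, p))          # observed n x p design matrix (simulated)
w_true = np.array([1.0, -2.0, 0.5])  # unobserved in practice; known here because we simulate
eps = rng.normal(scale=0.1, size=n)  # independent errors, mean 0, equal variances
Y = X @ w_true + eps                 # the regression model Y = Xw + eps

# Least-squares estimate: the w minimizing ||Y - Xw||^2
w_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(w_hat)
```

With a small noise scale like this one, `w_hat` should land close to `w_true`, but it is only an estimate of the unobserved $w$.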

Suppose the $n$ entries of $\varepsilon$ are independent and identically distributed with common density $g,$ and let $f(x) = \prod_{i=1}^n g(x_i)$ be the joint density of the vector $\varepsilon.$ Then the probability that $\varepsilon$ falls in some specified set $A\subseteq\mathbb R^n$ is $$ \int_A f(x) \, dx. $$ The probability that $Y$ falls in any specified set $B$ is the probability that $Xw+\varepsilon$ is in $B$. The event that $Xw+\varepsilon$ is in $B$ is the same as the event that $\varepsilon$ is in the set $B-Xw = \{ b-Xw : b\in B \}.$ Thus that probability is $$ \int_{B\,-\,Xw} f(x) \, dx. $$ This is the same as $$ \int_B f(y - Xw)\, dy $$ with the same function $f,$ if $y = Xw+x.$ The change of variables does not necessitate multiplying $dy$ by anything to get $dx,$ since the change of variables consists only of addition of a constant (and "constant" in this context means not depending on $x$ or $y$). And $x\in B-Xw$ if and only if $y\in B.$ Since $\Pr(Y\in B) = \int_B f(y-Xw)\,dy$ for every set $B,$ the integrand is the density of $Y$ for the given $w$: $$ p_Y(y) = f(y - Xw), $$ which is the claimed identity $p(Y\mid w) = p_E(Y - Xw).$
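The argument above can be sanity-checked numerically. The sketch below assumes standard normal errors, $n = 2$, $p = 1$, and a rectangular set $B$; all of these are illustrative choices, not part of the question. It compares a Monte Carlo estimate of $\Pr(Y\in B)$ with the integral of the shifted error density $f(y - Xw)$ over $B,$ which for independent normal errors factors into a product of one-dimensional normal probabilities.

```python
import math
import numpy as np

def normal_cdf(z):
    """Standard normal c.d.f., written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

rng = np.random.default_rng(1)  # arbitrary seed

X = np.array([[1.0], [2.0]])    # 2 x 1 design matrix (made-up values)
w = np.array([0.5])             # fixed w (made-up value)
sigma = 1.0                     # error standard deviation (assumption)
shift = X @ w                   # the constant vector Xw

# B is the rectangle [0, 1] x [0, 2] (illustrative choice)
lo = np.array([0.0, 0.0])
hi = np.array([1.0, 2.0])

# Integral over B of f(y - Xw) dy; with independent normal errors it
# factors coordinate by coordinate into differences of normal c.d.f.s.
exact = 1.0
for a, b, m in zip(lo, hi, shift):
    exact *= normal_cdf((b - m) / sigma) - normal_cdf((a - m) / sigma)

# Monte Carlo estimate of Pr(Y in B), simulating Y = Xw + eps directly
N = 400_000
eps = rng.normal(scale=sigma, size=(N, 2))
Y = shift + eps
mc = np.all((Y >= lo) & (Y <= hi), axis=1).mean()

print(exact, mc)
```

The two printed numbers should agree up to Monte Carlo error, as expected if the density of $Y$ is the error density shifted by $Xw.$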