Let $(\Omega,\Sigma,P)$ be a probability space and let $Y: \Omega \to \mathbb{R}$ be a random variable. Let $x, \beta \in \mathbb{R}^n$. In a linear model we assume something like $E(Y|x)=\beta^Tx$. I know about conditional expectation, where you condition either on an event $E \in \Sigma$ or on a sub-$\sigma$-algebra $\Sigma' \subseteq \Sigma$. As a special case of the latter, we can condition on another random variable. But the variable $x$ is not random, so how would you rigorously define $E(Y|x)$?
Is there a rigorous probability-theoretic formulation of linear regression?
185 Views. Asked by Bumbble Comm (https://math.techqa.club/user/bumbble-comm/detail). There are 2 best solutions below.
There are several models that can be called "linear regression."
The most straightforward model is to assume we have $p$ $n$-vectors $X = [x_{(1)}, \ldots, x_{(p)}]$ and a "response" $y$ in $\mathbf{R}^n$ both of which are known, and we want to obtain the projection of $y$ onto the span of $X.$ There are no probability assumptions here.
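This purely geometric version can be sketched in a few lines of NumPy (the data here are arbitrary placeholders, just to exhibit the projection and its defining orthogonality property):

```python
import numpy as np

# Hypothetical data: n = 5 observations, p = 2 predictor columns.
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 2))
y = rng.standard_normal(5)

# Coefficients of the orthogonal projection of y onto the column span of X.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta  # the projection itself

# The residual y - y_hat is orthogonal to every column of X.
print(np.allclose(X.T @ (y - y_hat), 0))
```

No probabilistic assumption is used anywhere: `lstsq` just solves the normal equations, and the check at the end is exactly the statement that $\hat y$ is the orthogonal projection.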
A second model is to assume $(Y,X)$ is multinormal and then $E(Y \mid X)$ is linear. Here we explicitly assume $(Y,X)$ follow a particular probability distribution.
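To make the "linear" claim concrete, the standard conditional-mean formula for a jointly normal pair (with the obvious block notation for the mean and covariance) is

$$E(Y \mid X = x) = \mu_Y + \Sigma_{YX}\,\Sigma_{XX}^{-1}\,(x - \mu_X),$$

which is affine in $x$, and genuinely linear once the variables are centered.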
The third is to observe that for $L^2$ random variables, $Y = E(Y \mid \mathscr{H}) + (Y - E(Y \mid \mathscr{H}))$ and we see that $E(Y \mid \mathscr{H})$ is the orthogonal projection of $Y$ onto $L^2(\mathscr{H})$; when $\mathscr{H} = \sigma(X),$ then $E(Y \mid \mathscr{H}) = E(Y \mid X) = f(X)$ for some (deterministic but unknown) measurable function $f$; often we approximate linearly $f(x) \approx \beta^\intercal x$ and then we recover Ordinary Linear Regression. Here, we only need $Y$ to have a finite second moment.
The typical way linear models work is to assume that $Y = \beta^T X + \varepsilon$, where $X$ is a random vector that we believe is related to $Y$, $\varepsilon$ is another random variable independent of $X$, and $\beta$ is constant (i.e. deterministic), but possibly unknown. For example, $Y$ might be a person's weight and $X$ might be a person's height.
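A quick simulation sketch of this model (the coefficient values and sample size are arbitrary choices for illustration): generate $X$ and independent noise $\varepsilon$, form $Y = \beta^T X + \varepsilon$, and check that ordinary least squares recovers $\beta$ up to sampling error.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 10_000, 3
beta_true = np.array([2.0, -1.0, 0.5])  # illustrative "unknown" coefficients

X = rng.standard_normal((n, p))   # covariates
eps = rng.standard_normal(n)      # noise, independent of X
Y = X @ beta_true + eps           # the assumed linear model

# OLS estimate of beta from the simulated sample.
beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(beta_hat)  # close to beta_true, within sampling error
```

With $n = 10{,}000$ observations the standard error of each coefficient is roughly $1/\sqrt{n} = 0.01$, so the estimate lands very near the true $\beta$.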
There is a theorem that states that (under certain conditions) there exists a measurable, deterministic function $f$ such that $\mathbb{E}[Y|X] = f(X)$. One can then define $\mathbb{E}[Y|x] = f(x)$ to avoid conditioning on null events like $X = x$ when $X$ is a continuous random variable.
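The theorem alluded to is (a consequence of) the Doob–Dynkin lemma; a minimal statement, given here without proof:

$$\text{If } Z \text{ is } \sigma(X)\text{-measurable, then } Z = f(X) \text{ for some Borel-measurable } f.$$

Applying this to $Z = \mathbb{E}[Y|X]$, which is $\sigma(X)$-measurable by construction, produces the function $f$, and $\mathbb{E}[Y|x] := f(x)$ is then well defined (up to modification on a set of $P_X$-measure zero) even when the event $\{X = x\}$ has probability zero.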