I'm training a multivariate linear regression model using sklearn. All good. However, I'm struggling to understand the math underneath it when the training examples have several features. Concretely, I'd like to know if there is a general polynomial equation that holds for feature vectors of any size. The first equation on this link shows the generalized equation when x is a single number. How do I generalize this when x is, for instance, a 3-element vector?
This is a practical example of what I'm trying to understand.
What you're looking for is a general linear model, which includes special cases like simple and multiple linear regression, ANOVA models, polynomial regression, etc. The general form is $\mathbf{Y} = \mathbf{X}\boldsymbol\beta + \boldsymbol{\varepsilon}$, where $\mathbf{Y}$ is the vector of responses, $\mathbf{X}$ is the design matrix (containing all the terms without the coefficients), $\boldsymbol\beta$ is the vector containing the coefficients, and $\boldsymbol\varepsilon$ is the vector of random errors satisfying $E(\boldsymbol\varepsilon) = \mathbf{0}, Var(\boldsymbol\varepsilon) =\sigma^2 \mathbf{I}$.
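To make this concrete, here is a minimal sketch (with made-up toy data, not from the question) of fitting $\mathbf{Y} = \mathbf{X}\boldsymbol\beta + \boldsymbol\varepsilon$ by ordinary least squares: build the design matrix $\mathbf{X}$ explicitly, with a column of ones for the intercept, and solve for $\boldsymbol\beta$:

```python
import numpy as np

# Toy data (illustrative values only): n = 5 observations of a single predictor x.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 7.1, 8.8])

# Design matrix X: a column of ones (intercept term) plus the predictor column.
X = np.column_stack([np.ones_like(x), x])

# Least-squares estimate of beta: minimizes ||y - X beta||^2.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # [beta_0 (intercept), beta_1 (slope)] -> approximately [1.10, 1.96]
```

For a feature vector instead of a scalar, nothing changes except that $\mathbf{X}$ gains more columns, one per term in the model.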
Written out in full, your RGB model would look something like this:
$ \begin{bmatrix} y_1\\ y_2\\ y_3 \\ \vdots \\ y_n \end{bmatrix}= \begin{bmatrix} 1 & r_1 & g_1 & b_1 & r_1^2 & g_1^2 & b_1^2 & r_1g_1 & r_1b_1 & g_1b_1 \\ 1 & r_2 & g_2 & b_2 & r_2^2 & g_2^2 & b_2^2 & r_2g_2 & r_2b_2 & g_2b_2 \\ 1 & r_3 & g_3 & b_3 & r_3^2 & g_3^2 & b_3^2 & r_3g_3 & r_3b_3 & g_3b_3 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ 1 & r_n & g_n & b_n & r_n^2 & g_n^2 & b_n^2 & r_ng_n & r_nb_n & g_nb_n \end{bmatrix} \begin{bmatrix} \beta_0\\ \beta_1\\ \beta_2\\ \beta_3 \\ \beta_4 \\ \beta_5 \\ \beta_6 \\ \beta_7 \\ \beta_8 \\ \beta_9 \end{bmatrix} + \begin{bmatrix} \varepsilon_1\\ \varepsilon_2\\ \varepsilon_3 \\ \vdots \\ \varepsilon_n \end{bmatrix} $
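In sklearn, this design matrix is exactly what `PolynomialFeatures(degree=2)` generates for a 3-element feature vector. A minimal sketch with made-up RGB values (the column ordering sklearn uses differs slightly from the matrix displayed above, but the same ten terms appear):

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

# Toy data (illustrative values only): each row is one (r, g, b) observation.
rgb = np.array([
    [0.1, 0.2, 0.3],
    [0.4, 0.5, 0.6],
    [0.7, 0.8, 0.9],
    [0.2, 0.9, 0.4],
])
y = np.array([1.0, 2.0, 3.0, 4.0])

# degree=2 expands (r, g, b) into the 10 terms of the quadratic model:
# 1, r, g, b, r^2, rg, rb, g^2, gb, b^2.
poly = PolynomialFeatures(degree=2, include_bias=True)
X = poly.fit_transform(rgb)
print(X.shape)  # (4, 10): n rows, one column per term

# fit_intercept=False because the column of ones is already in X.
model = LinearRegression(fit_intercept=False).fit(X, y)
print(model.coef_.shape)  # (10,): one coefficient beta_j per column
```

So there is no separate "vector version" of the equation: sklearn fits the same $\mathbf{Y} = \mathbf{X}\boldsymbol\beta + \boldsymbol\varepsilon$, just with a wider $\mathbf{X}$.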