I have a training data matrix $S_{\tau \times n}$ and an actual output $y_{1\times p}$. A linear model maps the input to the output through two weight parameters, $\alpha$ and $\beta$:
$$ y = \alpha_{1 \times \tau}S_{\tau \times n}\beta_{n\times p}$$
My question is: can this formula be rewritten in the standard linear form?
$$y=Ax$$
If not, what method can be used to find the parameters $\alpha$ and $\beta$ that minimize the mean squared error between the prediction $\hat{y}$ and the actual output $y$?
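
To make the dimensions concrete, here is a minimal NumPy sketch of the setup. The sizes, the random data, and the helper names `predict` and `mse` are assumptions made up for illustration; they are not part of the actual problem.

```python
import numpy as np

rng = np.random.default_rng(0)
tau, n, p = 4, 3, 2  # hypothetical sizes, chosen only for illustration

S = rng.standard_normal((tau, n))      # training data, tau x n
alpha = rng.standard_normal((1, tau))  # left weights, 1 x tau
beta = rng.standard_normal((n, p))     # right weights, n x p
y = rng.standard_normal((1, p))        # actual output, 1 x p


def predict(alpha, S, beta):
    """Model output y_hat = alpha S beta, shape 1 x p."""
    return alpha @ S @ beta


def mse(alpha, S, beta, y):
    """Mean squared error between the prediction and the actual output."""
    residual = predict(alpha, S, beta) - y
    return float(np.mean(residual ** 2))


y_hat = predict(alpha, S, beta)
loss = mse(alpha, S, beta, y)

# Identity vec(alpha S beta) = (beta^T kron alpha) vec(S), with column-major vec.
A = np.kron(beta.T, alpha)        # shape (p, tau * n), built from both unknowns
x = S.reshape(-1, order="F")      # vec(S), length tau * n, holds only data
assert np.allclose(A @ x, y_hat.ravel())
```

The final check relies on the identity $\operatorname{vec}(\alpha S \beta) = (\beta^{T} \otimes \alpha)\operatorname{vec}(S)$. It does give something of the shape $y = Ax$, but with $A$ depending on the product of both unknowns and $x$ holding the data, so as far as I can tell it is not an ordinary least-squares problem in $\alpha$ and $\beta$.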