I was recently reading a Stack Exchange post on solving a polynomial regression problem with gradient descent, and one contributor identified the objective function as:
$$\text{argmin}_{\beta} \| \mathbf{y - X\beta}\|_2^2$$
They identified this as the "squared loss," i.e., the L2 loss function, which makes sense.
The notation in the equation, however, is new to me. Why are the 2's stacked atop each other in this equation? What do those 2's signify? What is the difference between the formulation above and:
$$\text{argmin}_{\beta} \| \mathbf{y - X\beta}\|^2$$
I'd be very grateful for any help others can offer with this question.
The subscript indicates the type of norm. For instance, the finite-dimensional Lp norms are given by
$$||\vec{x}||_p = \left(\sum_{i=1}^n |x_i|^p \right)^{1/p}$$
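To make the formula concrete, here is a small sketch in plain Python that evaluates the Lp norm for a few values of $p$ (the vector $(3, -4)$ is just a made-up example):

```python
def lp_norm(x, p):
    """Finite-dimensional Lp norm: (sum over i of |x_i|^p)^(1/p)."""
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

x = [3.0, -4.0]
print(lp_norm(x, 1))    # L1 norm: |3| + |-4| = 7.0
print(lp_norm(x, 2))    # L2 (Euclidean) norm: sqrt(9 + 16) = 5.0
print(lp_norm(x, 100))  # large p approaches the max-norm, max(|x_i|) = 4
```

Note how larger $p$ weights the largest component more heavily, which is why the $p \to \infty$ limit recovers the maximum norm.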
The superscript is simply the square. So, the L2 norm is then
$$||\vec{x}||_2 = \left(\sum_{i=1}^n |x_i|^2 \right)^{1/2}$$
$$||\vec{x}||_2^2 = \sum_{i=1}^n x_i^2.$$
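This is exactly why the squared L2 norm shows up in the regression objective: $\|\mathbf{y} - \mathbf{X\beta}\|_2^2$ is just the sum of squared residuals. A minimal sketch, using made-up values for $\mathbf{y}$ and the fitted values $\mathbf{X\beta}$:

```python
def l2_norm_squared(v):
    """Squared L2 norm: the plain sum of squares, no square root."""
    return sum(vi ** 2 for vi in v)

y = [1.0, 2.0, 3.0]        # hypothetical observed values
Xbeta = [0.5, 2.5, 2.0]    # hypothetical fitted values X @ beta

residuals = [yi - fi for yi, fi in zip(y, Xbeta)]
rss = l2_norm_squared(residuals)  # 0.25 + 0.25 + 1.0 = 1.5
print(rss)
```

Squaring the norm drops the square root, which leaves a smooth sum of squares that is convenient to differentiate for gradient descent.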
The main idea is that there are many types of norms, and we often want to specify which one we're using. When the subscript is omitted, as in $\| \mathbf{y - X\beta}\|^2$, the Euclidean (L2) norm is usually assumed, so in practice the two formulations you wrote mean the same thing.