I am new to machine learning. I am currently studying various kinds of regression using the scikit-learn Python library. Here's an example from the documentation: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression There are a lot of mathematical formulas here and I don't understand much of the notation.
For example, there is a function:
$y(w, x) = w_0 + w_1 x_1 + \dots + w_p x_p$
and a vector
$w = (w_1, \dots, w_p)$
In that case, linear regression looks like:
$\min_w ||Xw - y||_2^2 + \alpha ||w||_2^2$
What do the "||" and the two "2"s at the bottom and top mean? I guess the "2" on top means square.
Could anyone explain this in simple terms, or give a link to a source where the symbols are explained? Thanks.

The $\min$ means you want to minimize the function with respect to a variable, in this case $w$. So you want to find out which value $w$ has to have so that your function is minimized. Here, $w$ is a vector, and you want to find the vector that minimizes the function.
The $||\cdot||_2$ is the norm, and the subscript, in your case $2$, defines which norm. Your function uses the L2 norm, which is calculated as $\sqrt{x_1^2 + x_2^2 + \dots + x_n^2}$.
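A minimal sketch of that formula in NumPy, using a made-up example vector (the values are arbitrary, chosen so the result is easy to check):

```python
import numpy as np

# Example vector (not from the docs; chosen so the norm is a round number).
x = np.array([3.0, 4.0])

# By the formula: sqrt(x1^2 + x2^2 + ... + xn^2)
manual = np.sqrt(np.sum(x ** 2))

# NumPy's built-in np.linalg.norm computes the L2 norm by default.
builtin = np.linalg.norm(x)

print(manual)   # 5.0
print(builtin)  # 5.0
```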
$X$ is a matrix of size $n \times m$. $Xw$ represents the matrix-vector product between the matrix $X$ and the vector $w$. https://mathinsight.org/matrix_vector_multiplication. This gives a vector back.
The target vector $y$ (the thing you want to predict) is then subtracted from this resulting vector, giving the residual $Xw - y$.
The $2$ at the upper right of $||\cdot||$ is simply the square. Squaring cancels the square root, so the norm from the point above becomes $x_1^2 + x_2^2 + \dots + x_n^2$. The error vector from the previous point is then put into this calculation.
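The steps so far can be sketched end to end with toy data (the matrix and vectors below are invented for illustration):

```python
import numpy as np

# Toy data: n = 3 samples, m = 2 features.
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
w = np.array([0.5, -1.0])        # a candidate coefficient vector
y = np.array([1.0, 0.0, 2.0])    # target vector

pred = X @ w                      # matrix-vector product, a vector of length n
residual = pred - y               # subtract the target vector
error = np.sum(residual ** 2)     # squared L2 norm: ||Xw - y||_2^2

print(error)  # 42.75
```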
Basically, you have your input matrix $X$, which you want to fit with a vector of coefficients $w$ of length $m$. To find this vector, you solve the linear least squares problem described by the steps above and represented mathematically by the function in your question.
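In code, the unregularised part of this problem can be solved directly with NumPy's least squares routine (again with toy data, constructed so the answer is exact):

```python
import numpy as np

# Toy data where y is exactly linear in x: y = 1 + 1*x.
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])   # first column is the intercept term
y = np.array([1.0, 2.0, 3.0])

# Find the w minimizing ||Xw - y||_2^2.
w, residuals, rank, sv = np.linalg.lstsq(X, y, rcond=None)
print(w)  # close to [1., 1.]
```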
Then you have something similar with $\alpha||w||_2^2$. This is the regularisation part of the function and prevents any single coefficient from massively dominating the least squares solution.
So you choose a value for $\alpha$. This is done outside the model, so we usually call it a "hyperparameter". Now, $||w||_2^2$ is again calculated as $w_1^2 + w_2^2 + \dots + w_m^2$. This means that when a value in the vector $w$ becomes too large while minimizing the first part, it is "regularized" by this term, because a very large value in $w$ would make $\alpha||w||_2^2$ very high.
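Putting both parts together, the full objective can be evaluated for any candidate $w$, and the minimizer has the well-known closed form $(X^T X + \alpha I)^{-1} X^T y$. A sketch with the same toy data as above:

```python
import numpy as np

# Full regularised objective: ||Xw - y||_2^2 + alpha * ||w||_2^2
def ridge_objective(X, y, w, alpha):
    return np.sum((X @ w - y) ** 2) + alpha * np.sum(w ** 2)

X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
y = np.array([1.0, 2.0, 3.0])
alpha = 1.0

# Closed-form minimizer of the regularised objective.
w_reg = np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)

# The unregularised least squares fit has larger weights ...
w_ols = np.linalg.lstsq(X, y, rcond=None)[0]

# ... and a larger value of the regularised objective.
print(ridge_objective(X, y, w_reg, alpha) <= ridge_objective(X, y, w_ols, alpha))  # True
```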
In machine learning you will be dealing with numerical optimization and linear algebra all the time, so I suggest you become familiar with these fields as soon as possible. It will take some time, but once you get the hang of it, you can read functions like this, and variations of it, quite easily.
Good luck!