I was reading about linear regression and mean squared error in machine learning, and I came across this explanation:
Suppose that we have a design matrix of $m$ example inputs that we will not use for training, only for evaluating how well the model performs. We also have a vector of regression targets providing the correct value of $y$ for each of these examples. Because this dataset will only be used for evaluation, we call it the test set. We refer to the design matrix of inputs as $\mathbf{X}^{\text{(test)}}$ and the vector of regression targets as $\mathbf{y}^{\text{(test)}}$.
One way of measuring the performance of the model is to compute the mean squared error of the model on the test set. If $\hat{\mathbf{y}}^{\text{(test)}}$ gives the predictions of the model on the test set, then the mean squared error is given by
$$\text{MSE}_{\text{test}} = \dfrac{1}{m} \sum_{i} (\hat{\mathbf{y}}^{\text{(test)}} - \mathbf{y}^{\text{(test)}})_i^2.$$
Intuitively, one can see that this error measure decreases to $0$ when $\hat{\mathbf{y}}^{\text{(test)}} = \mathbf{y}^{\text{(test)}}$. We can also see that
$$\text{MSE}_{\text{test}} = \dfrac{1}{m} \vert\vert \hat{\mathbf{y}}^{\text{(test)}} - \mathbf{y}^{\text{(test)}} \vert\vert_2^2,$$
so the error increases whenever the Euclidean distance between the predictions and the targets increases.
I have two (related) areas of confusion here.
What is the $i$ iterating over in the sum?
For the latter equation, we have the $2$-norm (the Euclidean norm). But, unless I'm misunderstanding the notation here, we don't necessarily have that $\text{MSE}_{\text{test}} = \dfrac{1}{m} \sum_{i} (\hat{\mathbf{y}}^{\text{(test)}} - \mathbf{y}^{\text{(test)}})_i^2 = \dfrac{1}{m} \vert\vert \hat{\mathbf{y}}^{\text{(test)}} - \mathbf{y}^{\text{(test)}} \vert\vert_2^2$ for $i = 2$, right? Again, I think I might be confused about the notation here (specifically, for the first equation), so that might be where my confusion comes from. Can someone please clarify this?
Thank you.
$\mathbf{y}^{\text{(test)}}$ and $\hat{\mathbf{y}}^{\text{(test)}}$ are vectors of length $m$, and thus so is their difference. The index $i$ runs from $1$ to $m$, iterating over the entries of the vector $\hat{\mathbf{y}}^{\text{(test)}} - \mathbf{y}^{\text{(test)}}$.
As for the second equation: the subscript $2$ in $\vert\vert \cdot \vert\vert_2$ is part of the norm notation (it labels the $2$-norm, i.e. the Euclidean norm), not a value of the index $i$. The Euclidean norm of the vector $\hat{\mathbf{y}}^{\text{(test)}} - \mathbf{y}^{\text{(test)}}$ is defined as $\sqrt{\sum_{i} (\hat{\mathbf{y}}^{\text{(test)}} - \mathbf{y}^{\text{(test)}})_i^2}$. Squaring the norm cancels the square root, so $\text{MSE}_{\text{test}} = \dfrac{1}{m} \sum_{i} (\hat{\mathbf{y}}^{\text{(test)}} - \mathbf{y}^{\text{(test)}})_i^2 = \dfrac{1}{m} \vert\vert \hat{\mathbf{y}}^{\text{(test)}} - \mathbf{y}^{\text{(test)}} \vert\vert_2^2$ does indeed hold.
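To make this concrete, here is a minimal NumPy sketch (with made-up test-set values, so the numbers themselves are just for illustration) showing that summing the squared entries and squaring the $2$-norm give the same MSE:

```python
import numpy as np

m = 4
y_test = np.array([1.0, 2.0, 3.0, 4.0])  # hypothetical regression targets
y_hat = np.array([1.1, 1.9, 3.2, 3.8])   # hypothetical model predictions

err = y_hat - y_test  # the vector whose entries i runs over

# MSE via the explicit sum over entries i = 1, ..., m
mse_sum = (err ** 2).sum() / m

# MSE via the squared Euclidean (2-) norm
mse_norm = np.linalg.norm(err, ord=2) ** 2 / m

print(mse_sum, mse_norm)  # both give the same value
```

The two computations agree up to floating-point rounding, which is exactly the identity in the second equation.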