Given a set of points $(x_i, y_i)$, least-squares linear regression finds the linear function $L$ such that $$\sum \varepsilon(y_i, L(x_i))$$ is minimized, where $\varepsilon(y, y') = (y-y')^2$ is the squared error between the actual $y_i$ value and the one predicted by $L$ from $x_i$.
Suppose I want to do the same, but with the following penalty function in place of $\varepsilon$: $$ \epsilon(y, L(x)) = \begin{cases} y-L(x) & \text{if } y>L(x)+1 \\ (y-L(x))^2 & \text{if } y\le L(x)+1 \end{cases}$$
If the fitted line passes a certain distance below the actual data point, I want to penalize it much less severely than if the line passes the same distance above the data point. (The threshold for the cheaper penalty is $y > L(x)+1$ rather than $y > L(x)$: if it were $y > L(x)$, then undershooting by some $\delta < 1$ would cost $\delta$ while overshooting by the same $\delta$ would cost only $\delta^2 < \delta$, so a small undershoot would be penalized *more* than a small overshoot, the opposite of what I want.)
I could probably write a computer program to solve this numerically, by using a hill-climbing algorithm or something similar. But I would like to know if there are any analytic approaches. If I try to follow the approach that works for the usual penalty function, I get stuck early on, because there seems to be no way to expand the expression $\epsilon(y_i, mx_i+b)$ algebraically.
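For concreteness, a rough sketch of the kind of numerical program I have in mind, assuming a derivative-free optimizer such as Nelder–Mead from `scipy.optimize` (the data, the warm start, and all variable names here are made up purely for illustration):

```python
import numpy as np
from scipy.optimize import minimize

def total_penalty(params, x, y):
    """Sum of the asymmetric penalties for the line y = m*x + b."""
    m, b = params
    r = y - (m * x + b)              # residual: positive when the line is below the point
    # linear penalty when the line undershoots by more than 1, squared penalty otherwise
    return np.sum(np.where(r > 1, r, r**2))

# illustrative synthetic data
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2 * x + 1 + rng.normal(size=x.size)

# warm-start from the ordinary least-squares fit, then refine numerically
m0, b0 = np.polyfit(x, y, 1)
result = minimize(total_penalty, x0=[m0, b0], args=(x, y), method="Nelder-Mead")
m, b = result.x
print(f"asymmetric fit: y = {m:.3f} x + {b:.3f}")
```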
I expect that the problem has been studied with different penalty functions, and I would be glad for a reference to a good textbook.
In the strict sense of the term, least squares methods deal with situations where $$\epsilon(y,L(x))= (y-L(x))^2$$
Such errors lend themselves to easy modeling via the normal equations. Other error metrics don't have such closed-form solutions; the methods for them are usually iterative in nature.
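To illustrate what "closed form via the normal equations" means in the squared-error case, here is a small sketch (data values and variable names are purely illustrative) that solves $(A^\top A)\beta = A^\top y$ directly, with no iteration:

```python
import numpy as np

# illustrative data
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

A = np.column_stack([x, np.ones_like(x)])   # design matrix for y = m*x + b
m, b = np.linalg.solve(A.T @ A, A.T @ y)    # normal equations, solved in closed form
print(f"least-squares fit: y = {m:.3f} x + {b:.3f}")
```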
The general topic you are looking for is the penalty function method. Just google it and you will find lots of references. Here is a nice place to start. Let me know if you want anything specific.