Given a set of points $(x_i, y_i)$, least-squares linear regression finds the linear function $L$ such that $$\sum \varepsilon(y_i, L(x_i))$$ is minimized, where $\varepsilon(y, y') = (y-y')^2$ is the squared error between the actual $y_i$ value and the one predicted by $L$ from $x_i$.
Suppose I want to do the same, but with the following penalty function in place of $\varepsilon$: $$ \epsilon(y, L(x)) = \begin{cases} y-L(x) & \text{if } y>L(x)+1 \\ (y-L(x))^2 & \text{if } y\le L(x)+1 \end{cases}$$
If the fitted line passes a certain distance below the actual data point, I want to penalize it much less severely than if the line passes the same distance above the data point. (The threshold for the cheaper penalty is $y > L(x)+1$ rather than $y > L(x)$: if it were $y > L(x)$, then undershooting by some $\delta < 1$ would cost $\delta$ while overshooting by the same $\delta$ would cost only $\delta^2 < \delta$, so a small undershoot would be penalized *more* than a small overshoot, the opposite of what I want.)
I could probably write a computer program to solve this numerically, by using a hill-climbing algorithm or something similar. But I would like to know if there are any analytic approaches. If I try to follow the approach that works for the usual penalty function, I get stuck early on, because there seems to be no way to expand the expression $\epsilon(y_i, mx_i+b)$ algebraically.
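For concreteness, a rough sketch of the kind of numerical program I have in mind, assuming a derivative-free optimizer such as Nelder–Mead from `scipy.optimize` (the data, the warm start, and all variable names here are made up purely for illustration):

```python
import numpy as np
from scipy.optimize import minimize

def total_penalty(params, x, y):
    """Sum of the asymmetric penalties for the line y = m*x + b."""
    m, b = params
    r = y - (m * x + b)              # residual: positive when the line is below the point
    # linear penalty when the line undershoots by more than 1, squared penalty otherwise
    return np.sum(np.where(r > 1, r, r**2))

# illustrative synthetic data
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2 * x + 1 + rng.normal(size=x.size)

# warm-start from the ordinary least-squares fit, then refine numerically
m0, b0 = np.polyfit(x, y, 1)
result = minimize(total_penalty, x0=[m0, b0], args=(x, y), method="Nelder-Mead")
m, b = result.x
print(f"asymmetric fit: y = {m:.3f} x + {b:.3f}")
```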
I expect that the problem has been studied with different penalty functions, and I would be glad for a reference to a good textbook.
In the strict sense of the term, least squares methods deal with situations where $$\epsilon(y,L(x))= (y-L(x))^2$$
Such errors lend themselves to easy modeling via the normal equations. Other error metrics don't have such closed-form solutions; the methods for them are usually iterative in nature.
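To illustrate what "closed form via the normal equations" means in the squared-error case, here is a small sketch (data values and variable names are purely illustrative) that solves $(A^\top A)\beta = A^\top y$ directly, with no iteration:

```python
import numpy as np

# illustrative data
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

A = np.column_stack([x, np.ones_like(x)])   # design matrix for y = m*x + b
m, b = np.linalg.solve(A.T @ A, A.T @ y)    # normal equations, solved in closed form
print(f"least-squares fit: y = {m:.3f} x + {b:.3f}")
```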
The general topic you are looking for is the penalty function method. Just google it and you will find lots of references. Here is a nice place to start. Let me know if you want anything specific.