How does a cost function with an absolute value (= no gradient direction) work?


I am a novice in this topic. As I understand it, when we optimize a function with gradient descent we need to know two things: how much we missed by (the gradient magnitude) and the direction of the gradient (its + or - sign).

Let's say we have y = w*x, with weight w = 4 and x = 3, so y = 4*3 = 12. Suppose the target value of y at x = 3 is 6. With a cost function that has no absolute value, the gradient (here, the signed error) is 6 - 12 = -6, so we can say that w should be adjusted by -6/3 = -2. The target w is then 4 - 2 = 2. Bingo.
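The arithmetic above can be written out as a minimal Python sketch. The variable names are only illustrative, and the "divide the signed error by x" update is the shortcut from this example, not a general gradient descent step.

```python
# Worked example from the question: y = w * x with a single weight.
w = 4.0        # current weight
x = 3.0        # input
target = 6.0   # desired output at x = 3

y = w * x                 # prediction
error = target - y        # signed miss: negative means we overshot
delta_w = error / x       # how far w has to move to hit the target exactly
w_new = w + delta_w       # corrected weight

print(y, error, delta_w, w_new)
```

The sign of `error` is what tells us to reduce w here rather than increase it.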

On the other hand, if we use a cost function based on an absolute value, or something like squared loss, the cost itself no longer carries the direction of the miss. We don't know whether we overshot or undershot, that is, whether w should be reduced or increased in this example. Can you please explain how gradient optimization works in this case?
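One way to probe this concern numerically is to look at the derivative of the loss with respect to w rather than at the loss value itself. The sketch below assumes the same model y = w*x and the numbers from the example above; the function names are hypothetical, and the absolute loss is given a derivative of 0 at exactly zero error, which is a convention chosen here, not a mathematical fact.

```python
def squared_loss_grad(w, x, target):
    # d/dw (w*x - target)**2 = 2 * (w*x - target) * x
    return 2.0 * (w * x - target) * x


def abs_loss_grad(w, x, target):
    # d/dw |w*x - target| = sign(w*x - target) * x
    # (not differentiable at zero error; 0 is used there by convention)
    residual = w * x - target
    sign = (residual > 0) - (residual < 0)
    return sign * x


x, target = 3.0, 6.0

# Overshoot case from the example (w = 4, prediction 12 > target 6).
print(squared_loss_grad(4.0, x, target), abs_loss_grad(4.0, x, target))

# Undershoot case (w = 1, prediction 3 < target 6).
print(squared_loss_grad(1.0, x, target), abs_loss_grad(1.0, x, target))
```

In both cases the loss value is non-negative, but the derivative is positive when the prediction overshoots and negative when it undershoots, so a step of the form `w - learning_rate * grad` moves w in the right direction.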