I am trying to solve a multivariate optimization problem (minimizing a linear, first-order objective function) using gradient descent. The objective function is simple:
f(x, y) = 2*x + 3*y - 10
With each iteration (with a step_size of 0.001), I update x and y in the classical textbook way (here df/dx = 2 and df/dy = 3):
x = x - (step_size * df/dx)
y = y - (step_size * df/dy)
With each iteration, when I evaluate the objective at the updated (x, y), the solution seems to run away negatively (with x and y both decreasing), moving away from 0, the root of the function.
What is this behaviour? Is gradient descent "thinking" that it is decreasing the objective function by decreasing (x, y), even though the objective value keeps falling further into negative territory?
Here is a trace of f(x,y) and (x,y) over 50 iterations, with (x,y) initialized at (1,1). Notice that f(x,y) keeps decreasing (moving away from the target value 0) as x and y decrease:
> -5.012999999999153 [ 0.9979999999997204, 0.9970000000004688 ]
> -5.025999999998305 [ 0.9959999999994409, 0.9940000000009377 ]
> -5.038999999997458 [ 0.9939999999991613, 0.9910000000014065 ]
> -5.05199999999661 [ 0.9919999999988818, 0.9880000000018754 ]
> -5.064999999995763 [ 0.9899999999986022, 0.9850000000023442 ]
> -5.0779999999949155 [ 0.9879999999983227, 0.9820000000028131 ]
> -5.090999999994068 [ 0.9859999999980431, 0.9790000000032819 ]
> -5.103999999993221 [ 0.9839999999977636, 0.9760000000037508 ]
> -5.116999999992373 [ 0.981999999997484, 0.9730000000042196 ]
> -5.129999999991526 [ 0.9799999999972044, 0.9700000000046884 ]
> -5.142999999990678 [ 0.9779999999969249, 0.9670000000051573 ]
> -5.155999999989831 [ 0.9759999999966453, 0.9640000000056261 ]
> -5.1689999999889835 [ 0.9739999999963658, 0.961000000006095 ]
> -5.181999999988136 [ 0.9719999999960862, 0.9580000000065638 ]
> -5.194999999987289 [ 0.9699999999958067, 0.9550000000070327 ]
> -5.207999999986441 [ 0.9679999999955271, 0.9520000000075015 ]
> -5.220999999985594 [ 0.9659999999952475, 0.9490000000079704 ]
> -5.233999999984746 [ 0.963999999994968, 0.9460000000084392 ]
> -5.246999999983899 [ 0.9619999999946884, 0.943000000008908 ]
> -5.259999999983052 [ 0.9599999999944089, 0.9400000000093769 ]
> -5.272999999982204 [ 0.9579999999941293, 0.9370000000098457 ]
> -5.285999999981357 [ 0.9559999999938498, 0.9340000000103146 ]
> -5.298999999980509 [ 0.9539999999935702, 0.9310000000107834 ]
> -5.311999999979662 [ 0.9519999999932907, 0.9280000000112523 ]
> -5.3249999999788145 [ 0.9499999999930111, 0.9250000000117211 ]
> -5.337999999977967 [ 0.9479999999927315, 0.92200000001219 ]
> -5.35099999997712 [ 0.945999999992452, 0.9190000000126588 ]
> -5.363999999976272 [ 0.9439999999921724, 0.9160000000131276 ]
> -5.376999999975425 [ 0.9419999999918929, 0.9130000000135965 ]
> -5.389999999974577 [ 0.9399999999916133, 0.9100000000140653 ]
> -5.40299999997373 [ 0.9379999999913338, 0.9070000000145342 ]
> -5.4159999999728825 [ 0.9359999999910542, 0.904000000015003 ]
> -5.428999999972035 [ 0.9339999999907747, 0.9010000000154719 ]
> -5.441999999971188 [ 0.9319999999904951, 0.8980000000159407 ]
> -5.45499999997034 [ 0.9299999999902155, 0.8950000000164096 ]
> -5.467999999969493 [ 0.927999999989936, 0.8920000000168784 ]
> -5.480999999968645 [ 0.9259999999896564, 0.8890000000173472 ]
> -5.493999999967798 [ 0.9239999999893769, 0.8860000000178161 ]
> -5.5069999999669506 [ 0.9219999999890973, 0.8830000000182849 ]
> -5.519999999966103 [ 0.9199999999888178, 0.8800000000187538 ]
> -5.532999999965256 [ 0.9179999999885382, 0.8770000000192226 ]
> -5.545999999964408 [ 0.9159999999882587, 0.8740000000196915 ]
> -5.558999999963561 [ 0.9139999999879791, 0.8710000000201603 ]
> -5.5719999999627134 [ 0.9119999999876995, 0.8680000000206292 ]
> -5.584999999961866 [ 0.90999999998742, 0.865000000021098 ]
> -5.597999999961019 [ 0.9079999999871404, 0.8620000000215668 ]
> -5.610999999960171 [ 0.9059999999868609, 0.8590000000220357 ]
> -5.623999999959324 [ 0.9039999999865813, 0.8560000000225045 ]
> -5.636999999958476 [ 0.9019999999863018, 0.8530000000229734 ]
> -5.649999999957629 [ 0.8999999999860222, 0.8500000000234422 ]
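For reference, here is a minimal sketch of the loop that produces this trace (my own reconstruction of the setup described above; the gradient is constant, df/dx = 2 and df/dy = 3):

```python
# f(x, y) = 2*x + 3*y - 10, a linear function.
# Its gradient is constant: df/dx = 2, df/dy = 3.
def f(x, y):
    return 2 * x + 3 * y - 10

step_size = 0.001
x, y = 1.0, 1.0  # starting point (1, 1)

for _ in range(50):
    x = x - step_size * 2  # x -= step_size * df/dx
    y = y - step_size * 3  # y -= step_size * df/dy
    print(f(x, y), [x, y])
```

After 50 iterations this lands at roughly x = 0.9, y = 0.85, f(x, y) = -5.65, matching the last line of the trace.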
My question is: isn't the gradient supposed to tell us where to go to minimize the function? If so, why is the solution drifting away? Thanks for your time; any help will be appreciated.