Multivariable Gradient Descent

648 Views Asked by At

If I have a function $S: \mathbb{R}^2 \to \mathbb{R}$ that describes energy falloff in space.

I have a source $S$ positioned at $(S_x, S_y)$

and the intensity at any given point in space (x, y) is

$$ S(x, y) = \frac{1}{(S_x - x)^2 + (S_y - y)^2} $$

Now, I've measured the intensity of the space so I know the value of all $S(x, y)$; however, I don't know the position of the source $(S_x, S_y)$. I'd like to make an initial guess of the source's position and use gradient descent to find the position.

Since I know the intensity at every point in space I can make an energy function

$$E(S_x, S_y) = \sum_x \sum_y {\left(S(x, y) - \frac{1}{(S_x - x)^2 + (S_y - y)^2}\right)^2}$$

I'd like to know how to continue with this problem. The problem I'm having is I know intuitively that I need a gradient direction vector to tell me which way to step but if I take the derivative of my energy function I'll get a scalar function back which will give me how much but not how much to move in each direction.

1

There are 1 best solutions below

0
On BEST ANSWER

I'm sure this was solved awhile ago, but the key is that one needs a gradient, rather than a (total) derivative.

I'll just rewrite your energy function as: $$ E(p,q) = \sum_x\sum_y \left[ S(x,y) - \frac{1}{(p-x)^2+(q-y)^2} \right]^2 $$ Then the gradient is a vector, given by: $$ \nabla E(p,q) = \begin{bmatrix} \partial_pE \\ \partial_qE \end{bmatrix} $$ In this case, I think we get: \begin{align*} \partial_p E &= \frac{\partial}{\partial p} \sum_x\sum_y \left[ S(x,y) - \frac{1}{(p-x)^2+(q-y)^2} \right]^2 \\ &= \sum_x\sum_y 2\left[ S(x,y) - \frac{1}{(p-x)^2+(q-y)^2} \right] \frac{\partial}{\partial p}\left( S(x,y) - \frac{1}{(p-x)^2+(q-y)^2} \right)\\ &= \sum_x\sum_y 2\left[ S(x,y) - \frac{1}{(p-x)^2+(q-y)^2} \right] \frac{\partial}{\partial p}\left( - [{(p-x)^2+(q-y)^2}]^{-1} \right)\\ &= \sum_x\sum_y 2\left[ S(x,y) - \frac{1}{(p-x)^2+(q-y)^2} \right] \left( [{(p-x)^2+(q-y)^2}]^{-2}(2p-x) \right)\\ &= 2\sum_x\sum_y \left[ S(x,y) - \frac{1}{(p-x)^2+(q-y)^2} \right] \left( \frac{2p-x}{[{(p-x)^2+(q-y)^2}]^{2}} \right) \end{align*} So by symmetry: $$ \partial_q E = 2\sum_x\sum_y \left[ S(x,y) - \frac{1}{(p-x)^2+(q-y)^2} \right] \left( \frac{2q-y}{[{(p-x)^2+(q-y)^2}]^{2}} \right) $$ Now, suppose you start at some guess value $(p_0,q_0)$. Then we run gradient descent from $t=1$ until convergence: $$ \begin{bmatrix} p_t \\ q_t \end{bmatrix} = \begin{bmatrix} p_{t-1} \\ q_{t-1} \end{bmatrix} - \eta_t \nabla E(p_{t-1},q_{t-1}) $$ where $\eta_t$ is the step size at time $t$.