Jacobian of a matrix vector product for a `scipy` numerical optimization routine


I have an $N \times K$ matrix of known data $\mathbf{Y}$ and a $K$-length vector of weights $\mathbf{x}$ (non-negative, summing to one), and want to optimize the objective function:

$$ f(\mathbf{x}) = \sum_{n=1}^{N} \log\left((\mathbf{Y}\mathbf{x})_n\right) $$

I can do this in Python with, e.g., `scipy.optimize.minimize`:

import scipy.optimize as opt
import numpy as np

N = 10
K = 3

logY = np.random.normal(size=30).reshape((N, K))
Y = np.exp(logY)

def f(x):
    # negate so that minimizing f maximizes the log objective
    return -np.sum(np.log(Y @ x))

result = opt.minimize(
    fun=f,
    x0=[1/K] * K,
    constraints=dict(type="eq", fun=lambda x: sum(x) - 1), 
    method="SLSQP",
    bounds=[(0, 1) for _ in range(K)],
)

This runs and gives relatively sensible results.

If I want to supply the Jacobian to this routine, I need:

$$ \frac{\partial f(\mathbf{x})}{\partial \mathbf{x}} = \frac{1}{\mathbf{x}} $$

def jacobian(x):
    return - 1/x # negation for minimization

However, adding this seems to result in failure, with completely uniform values of $\mathbf{x}$.
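One way to see the problem is to compare $-1/\mathbf{x}$ against a finite-difference gradient of the objective. Here is a self-contained sketch of that check; the use of `scipy.optimize.approx_fprime`, the random seed, and the step size are my choices, not part of the original question:

```python
import numpy as np
import scipy.optimize as opt

rng = np.random.default_rng(0)
N, K = 10, 3
Y = np.exp(rng.normal(size=(N, K)))

def f(x):
    return -np.sum(np.log(Y @ x))

def jacobian(x):
    return -1 / x  # the candidate gradient from above

x0 = np.full(K, 1 / K)
numeric = opt.approx_fprime(x0, f, 1e-8)

# -1/x is (-3, -3, -3) at x0 regardless of Y, while the true
# gradient of f clearly depends on the data in Y
print(jacobian(x0))
print(numeric)
```

If $-1/\mathbf{x}$ were correct, the two printed vectors would agree; in general they do not, which is consistent with the optimizer failing when given this Jacobian.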

In other code for a similar example, I have seen the Jacobian defined as:

def jacobian2(x):
    # note the parentheses: `/` binds tighter than `@`, so the
    # denominator must be grouped as the dot product (Y[n] @ x)
    deriv = lambda n, k: (Y[n, k] - Y[n, K - 1]) / (Y[n] @ x)
    partials = np.array([deriv(n, k) for n in range(N) for k in range(K)]).reshape((N, K))
    return -partials.sum(axis=0)

This, however, seems to work. So why is jacobian2 the right/better answer here? Is the derivative of the objective function $\frac{Y_{nk} - Y_{nK}}{(\mathbf{Y}\mathbf{x})_n}$ for each $n$ and $k$?
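For what it's worth, jacobian2 can be collapsed into a single vectorized expression, which makes its structure easier to read: subtract the last column of $\mathbf{Y}$ from every column, scale each row by $1/(\mathbf{Y}\mathbf{x})_n$, and sum over $n$. A quick equivalence check (self-contained, with my own seed, and assuming the intended grouping in the lambda is division by the dot product `Y[n] @ x`):

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 10, 3
Y = np.exp(rng.normal(size=(N, K)))
x = np.full(K, 1 / K)

def jacobian2(x):
    deriv = lambda n, k: (Y[n, k] - Y[n, K - 1]) / (Y[n] @ x)
    partials = np.array([deriv(n, k) for n in range(N) for k in range(K)]).reshape((N, K))
    return -partials.sum(axis=0)

# vectorized equivalent: subtract the last column, divide each row
# by (Yx)_n, and sum the rows
r = 1.0 / (Y @ x)
vectorized = -(Y - Y[:, [K - 1]]).T @ r

print(np.allclose(jacobian2(x), vectorized))  # True
```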

Best answer:

I assume that the cost function is the scalar $f(\mathbf{x})=\mathbf{1}^T \log(\mathbf{Yx})$ where $\log$ is applied elementwise on a vector.

The gradient of $f$ wrt $\mathbf{x}$ is $$ \frac{\partial f}{\partial \mathbf{x}}=\mathbf{Y}^T \frac{\mathbf{1}}{\mathbf{Yx}} $$ where again the division must be taken elementwise.
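As a sanity check, this gradient can be implemented directly, verified against finite differences, and then passed to `minimize` via `jac=` (negated, since the code minimizes $-f$). A sketch under the same setup as the question; the seed, tolerances, and use of `approx_fprime` are my choices:

```python
import numpy as np
import scipy.optimize as opt

rng = np.random.default_rng(0)
N, K = 10, 3
Y = np.exp(rng.normal(size=(N, K)))

def f(x):
    # minimizing the negative of 1^T log(Yx)
    return -np.sum(np.log(Y @ x))

def grad(x):
    # gradient of -1^T log(Yx): -Y^T (1 / (Yx)), division elementwise
    return -Y.T @ (1.0 / (Y @ x))

x0 = np.full(K, 1 / K)

# verify the analytic gradient against a finite-difference estimate
fd = opt.approx_fprime(x0, f, 1e-8)
assert np.allclose(grad(x0), fd, rtol=1e-3, atol=1e-4)

result = opt.minimize(
    fun=f,
    jac=grad,
    x0=x0,
    constraints=dict(type="eq", fun=lambda x: np.sum(x) - 1),
    method="SLSQP",
    bounds=[(0, 1)] * K,
)
print(result.x)
```

Note that, unlike jacobian2, this is the unconstrained gradient; SLSQP handles the simplex constraint itself, so there is no need to fold $\sum_k x_k = 1$ into the derivative by hand.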