Jacobian of a quadratic vector field

827 Views Asked by At

I am trying to calculate the partial derivative of a function of several variables, and it is pushing my understanding of matrix calculus. The function is the following

$$f(x) = M D(x) R x$$

where $D(x)$ is the diagonal matrix with the vector $x = (x_1,\dots,x_n)$ on the main diagonal, and $M$ and $R$ are $n \times n $ real matrices. What I am looking for is the matrix of partial derivatives

$$\frac{\partial f(x)}{\partial x_i}$$

I can derive this by expanding the above into non-matrix notation, but it is quite messy and I can't figure out how to simplify it. Ideally I'd like to have $\partial f(x) / \partial x_i$ in terms of $M$ and $R$.

I'm hoping this is a fairly straightforward application of matrix calculus rules, but I can't seem to find any useful way of dealing with this combined function of matrix. Thanks!

3

There are 3 best solutions below

2
On BEST ANSWER

I think expanding matrix multiplication, as you have tried, is a good idea, I have not been able to find a closed form solution for this case. What you are looking for is the Jacobian matrix ($J_f \in \mathbb{R^{n \times n}}$):

$$ J_f = \frac{\partial f}{\partial x} = \frac{\partial}{\partial x}\left( M D(x) R x\right)=M \frac{\partial}{\partial x}\left(D(x) R x\right)$$

Now, you can develop with indexes term $R x$, which is a column vector, as follows:

$$(R x)_j = \sum_k^n R_{j,k}x_k$$

And, intermediately, you can compute $D(x)R x$ as:

$$(D(x) R x)_j = \sum_k^n x_j R_{j,k}x_k$$

Now, you can take derivatives w.r.t. $x_i$, such as:

$$\left(\frac{\partial D(x) R x}{\partial x}\right)_{i,j} = \frac{\partial (D(x) R x)_j}{\partial x_i} = \sum_k^n R_{j,k} \frac{\partial}{\partial x_i} (x_j x_k)$$

Now, you can differentiate two cases:

  • Case 1 $i \neq j$ (out off diagonal terms): In this case, you have $\frac{\partial}{\partial x_i} (x_j x_k) = x_j \delta_{i,k}$:

$$\left(\frac{\partial D(x) R x}{\partial x}\right)_{i,j} =R_{j,i} x_j \equiv (D(x)R)_{i,j}$$

  • Case 2 $i = j$ (diagonal terms): In this case, you have $\frac{\partial}{\partial x_i} (x_j x_k) = x_k - x_j \delta_{j,k}+2x_j\delta_{j,k}=x_k +x_j\delta_{j,k}$:

$$\left(\frac{\partial D(x) R x}{\partial x}\right)_{i,j} = \left(\sum_k^n R_{j,k} x_k\right) + R_{j,j}x_j \equiv (D(x)R)_{i,j}+(Rx)_{i,j}$$

So, recapping results, and putting then in matrix form:

$$\left(\frac{\partial D(x) R x}{\partial x}\right)_{i,j} = \delta_{i,j}R_{i,j}x_j + (D(x)R)_{i,j} = D(x)R + \text{diag}(R x)$$

Leading to:

$$ J_f = M[D(x)R+\text{diag}(Rx)]$$

Edit: To correct calculation error. Special thanks to @daw.

0
On

Let vector field $\mathrm f : \mathbb R^n \to \mathbb R^n$ be defined by

$$\mathrm f (\mathrm x) := \mathrm A \mbox{diag} (\mathrm x) \mathrm B \mathrm x$$

The directional derivative of $\rm f$ in the direction of $\rm v$ at $\rm x$ is

$$\lim_{h \to 0} \frac{\mathrm f (\mathrm x + h \mathrm v) - \mathrm f (\mathrm x)}{h} = \mathrm A \mbox{diag} (\mathrm x) \mathrm B \mathrm v + \mathrm A \mbox{diag} (\mathrm v) \mathrm B \mathrm x$$

Since

$$\mbox{diag} (\mathrm u) \mathrm v = \mbox{diag} (\mathrm v) \mathrm u$$

we obtain

$$\lim_{h \to 0} \frac{\mathrm f (\mathrm x + h \mathrm v) - \mathrm f (\mathrm x)}{h} = \mathrm A \mbox{diag} (\mathrm x) \mathrm B \mathrm v + \mathrm A \mbox{diag} (\mathrm B \mathrm x) \mathrm v = \mathrm A \left( \mbox{diag} (\mathrm x) \mathrm B + \mbox{diag} (\mathrm B \mathrm x) \right) \mathrm v$$

and, thus, the Jacobian matrix is

$${\mathrm J}_{\mathrm f} (\mathrm x) := \color{blue}{\mathrm A \left( \mbox{diag} (\mathrm x) \mathrm B + \mbox{diag} (\mathrm B \mathrm x) \right)}$$

0
On

A quick numerical (julia) test of Carlos' and Rodrigo's solutions.

n=3;
x=8*randn(n); M=38*randn(n,n); R=88*randn(n,n); dx=randn(n)*1e-6; 
D=diagm(x); dD=diagm(dx); Dr=diagm(diag(R));
f = M*D*R*x; df = M*(D+dD)*R*(x+dx) - f;
J1 = M*(D*R + diagm((R+Dr)*x));  # Carlos's  jacobian 
J2 = M*(D*R + diagm(R*x));       # Rodrigo's jacobian

df
3-element Array{Float64,1}:
  0.105202 
 -0.0797321
  0.300291 

J1*dx
3-element Array{Float64,1}:
  0.152478 
 -0.0989752
  0.351571 

J2*dx
3-element Array{Float64,1}:
  0.105202 
 -0.0797322
  0.300291 

The answers differ by one term, i.e. Carlos uses $${\rm diag}((R+D_R)x)$$ in place of Rodrigo's $${\rm diag}((R)x)$$