I have a function $$f(w) = \sum\left[(w.\Omega w)^2\right]-\sum\left[w.\Omega w\right]^2$$ where $.$ indicates element wise multiplication, $w$ is a vector, $\Omega$ is a symmetric positive definite matrix.
I need to optimize the above function w.r.t $w$. The size of my problem is small enough to do this with any old non linear optimizer, but I was curious to know what the gradient looks like, should I ever want to feed it in to the optimizer.
So this became a little toy puzzle for me to play with whilst alone in the office, but my persistent failure at finding the answer has become annoying.
In attempting to solve this, I noted that at some point I would need to find $$\frac{\partial (w.\Omega w)}{\partial w}$$
I'm pretty sure I have correctly worked out that this is $$w.\Omega + diag(\Omega w)$$ where $diag(x)$ converts the vector $x$ into a diagonal matrix.
To verify this, I have the following Matlab code. I can translate to R if required for this question.
a = randn(100000,1);
b = a + randn(100000,1);
c = a - b + randn(100000,1);
omega = cov([a b c], 1);
w = [0.1 0.5 0.4];
r = w'.*omega*w';
drdw = omega.*w' + diag(omega*w');
d = 0.000000001;
i = 3;
w_ = w;
w_(i) = w_(i) + d;
r_ = w_'.*omega*w_';
disp((r_ - r) ./ d)
disp(drdw(:,i))
So with that result in my pocket, I focused on the first term of $f$. My reasoning is as follows
$$\frac{\partial \sum\left[(w.\Omega w)^2\right]}{\partial w} = \sum\left[\frac{\partial ((w.\Omega w)^2)}{\partial w}\right]$$ application of chain rule gives $$2.\sum\left[(w.\Omega w) \frac{\partial (w.\Omega w)}{\partial w}\right]$$ and we already have the pocket solution from earlier.
But now when I try to verify this solution, it goes wrong.
r = sum(w'.*omega*w');
i = 3;
drdw = sum((w'.*(omega*w')).*(omega.*w' + diag(omega*w'))) * 2;
d = 0.000000001;
w_ = w;
w_(i) = w_(i) + d;
r_ = sum(w_'.*omega*w_');
disp((r_ - r) ./ d)
disp(drdw(i))
But where does it go wrong?
EDIT
As noted by Abdullah, there was a bug in my code, rather than a flaw in my maths. Correct code is
r = sum((w'.*omega*w').^2);
i = 3;
drdw = sum((w'.*(omega*w')).*(omega.*w' + diag(omega*w'))) * 2;
d = 0.000000001;
w_ = w;
w_(i) = w_(i) + d;
r_ = sum((w_'.*omega*w_').^2);
disp((r_ - r) ./ d)
disp(drdw(i))
I don't have MATLAB installed on my personal laptop and it will take a while to download Octave, so the following is hypothetical, i.e. debugging on paper, sigh.
I think it goes wrong with how you use the sum function of MATLAB. Since you are not passing the sum direction and
(w'.*(omega*w')).*(omega.*w' + diag(omega*w'))is a matrix,sumdoes the following according to the MATLAB reference documentationTry
sum((w'.*(omega*w')).*(omega.*w' + diag(omega*w')),2).Also
r_ = sum(w_'.*omega*w_');seems fishy.w_'.*omega*w_'is a vector, sosumwill return a scalar andr_is supposed to be a vector, too. But I don't know your application so I am not sure.Edit: if $r = \sum [(w.\Omega w)^2]$ then
r = sum(w'.*omega*w');is not the correct MATLAB code to implement it.