Consider the following equation $$ \mathbf{P}(\mathbf{A}\circ\mathbf{A})= \hat{\mathbf{P}} $$
where $\mathbf{P}$, $\mathbf{A}$ and $\hat{\mathbf{P}}$ are all square matrices with positive elements. I want to solve for $\mathbf{A}$. Due to the Schur product in there I don't think I can do so analytically (though if I can that would be great and would resolve all of my problems). The Schur product is there because the matrix $\mathbf{A}\circ\mathbf{A} = \tilde{\mathbf{A}}$ needs to have all non-negative elements. Given these constraints, I've decided to try to find $\mathbf{A}$ by gradient descent, minimising the Frobenius norm of the difference between the LHS and RHS of the equation, i.e. seeking a stationary point $$ \frac{d}{d\mathbf{A}}\|\mathbf{P}(\mathbf{A}\circ\mathbf{A}) - \hat{\mathbf{P}}\|_{F} = 0 $$
I would like to know if it's possible to show that this optimisation has a unique minimum. It seems to me that would be possible by showing the function is convex everywhere, but in the context of this matrix equation I'm not convinced I know how to do that. I would have thought that showing the function is convex in each individual element of $\mathbf{A}$ would do the trick?
Otherwise, I believe the typical argument is that a function is convex if its second derivative is positive, but in this case would I have to show that the second derivative is positive in all its elements? Or something like the second derivative (Hessian) being positive definite?
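One quick sanity check I can offer: the objective depends on $\mathbf{A}$ only through $\mathbf{A}\circ\mathbf{A}$, so flipping the sign of any single entry of $\mathbf{A}$ leaves the objective unchanged, and a minimiser with a nonzero entry always has a sign-flipped twin — so the minimum cannot be unique in $\mathbf{A}$ itself (though it may still be unique in $\mathbf{A}\circ\mathbf{A}$). A small numerical illustration, with random matrices standing in for the real data:

```python
import numpy as np

# P and P_hat are random illustrative stand-ins (assumption),
# not matrices from the actual problem.
rng = np.random.default_rng(1)
n = 3
P = rng.random((n, n)) + 0.1
P_hat = rng.random((n, n)) + 0.1

def f(A):
    """Frobenius-norm objective ||P(A∘A) - P_hat||_F."""
    return np.linalg.norm(P @ (A * A) - P_hat)

A = rng.standard_normal((n, n))
A_flipped = A.copy()
A_flipped[0, 0] *= -1.0   # flip the sign of one entry; A∘A is unchanged

f_A = f(A)
f_flip = f(A_flipped)
print(f_A, f_flip)        # identical values
```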
Let's define $F(A)=P(A\circ A)-\hat{P}$. Using http://www.matrixcalculus.org (where the matrix norm $\|\cdot\|_2$ denotes the Frobenius norm), we can see that:
$$\frac{\partial}{\partial A}\|F(A)\|_F = \frac{2\,(P^TF(A))\circ A}{\|F(A)\|_F}$$
Setting this to zero is the first-order condition you need to solve. If you can ensure that no element of $A$ is zero, then it simplifies to $P^TF(A)=0$, or $F(A)=0$ if $P$ is invertible, which is not really helpful. However, it provides you with the gradient, and a gradient descent should be doable.
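A minimal gradient-descent sketch of this, on random synthetic data (an assumption; your real $P$ and $\hat{P}$ go in its place). For smoother steps it minimises the *squared* norm, whose gradient is $2\,(P^TF(A))\circ A$ — the gradient above without the $1/\|F(A)\|_F$ factor — and which has the same minimisers:

```python
import numpy as np

# Synthetic problem: build P_hat from a known A_true so that an
# exact solution exists (illustrative assumption).
rng = np.random.default_rng(0)
n = 4
P = rng.random((n, n)) + 0.1            # positive elements
A_true = rng.random((n, n)) + 0.1
P_hat = P @ (A_true * A_true)           # consistent right-hand side

A = rng.random((n, n)) + 0.1            # positive initial guess
lr = 1e-3                               # step size, chosen by hand
res_init = np.linalg.norm(P @ (A * A) - P_hat)

for _ in range(50_000):
    F = P @ (A * A) - P_hat
    A -= lr * 2.0 * (P.T @ F) * A       # gradient of ||F(A)||_F^2

res_final = np.linalg.norm(P @ (A * A) - P_hat)
print(res_init, res_final)
```

The fixed step size is a crude choice; a line search or an off-the-shelf optimiser (e.g. `scipy.optimize.minimize` with this gradient) would be more robust in practice.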