I'm trying to compute the gradient and Hessian of the following function
$$f(x,y) = \frac{1}{2}\|Ax-By\|^2$$
where $A$ and $B$ are $m \times n$ matrices, $x, y \in \mathbb{R}^n$, and $f: \mathbb{R}^{2n} \to \mathbb{R}$.
I honestly don't have a clue about the best way to proceed. Usually, to find the gradient, I would rewrite the function as sums and differentiate from there, but the square and the multiple vector arguments have me stumped. I am not looking for a solution but rather a hint on where to start.
Furthermore, am I right in thinking that $\nabla f(x,y)$ is a vector in $\mathbb{R}^{2n}$ consisting of the partial derivatives with respect to the components of $x$ and $y$, and that $\nabla^2 f(x,y)$ is a $2n \times 2n$ matrix?
Thank you in advance.
An answer for the gradient.
Identifying vectors with column vectors (as you do):
$$f(x,y) := \frac{1}{2}\|Ax-By\|^2=\frac{1}{2}(Ax-By)^T(Ax-By)=$$
$$\frac{1}{2}(x^TA^T-y^TB^T)(Ax-By)$$ $$=\frac{1}{2}\left(x^T(A^TA)x-\underbrace{(x^TA^TBy+y^TB^TAx)}_{2x^TA^TBy}+y^T(B^TB)y\right)\tag{1}$$
Let us now apply two classical results:
1) For a symmetric matrix $M$, the gradient of $x^TMx$ with respect to $x$ is $2x^TM$, seen as a row vector. (For a general $M$ it is $x^T(M+M^T)$; here $M=A^TA$ or $M=B^TB$, both symmetric.) Why? Consider the (Taylor) expansion, where $h$ is a vector increment:
$$\underbrace{(x+h)^TM(x+h)}_{f(x+h)}=\underbrace{x^TMx}_{f(x)}+\underbrace{x^TMh+h^TMx}_{(2x^TM)h=f'(x)\cdot h}+\underbrace{h^TMh}_{\text{2nd order term}}$$
2) The gradient of $x^TMy$ with respect to $y$ is the row vector $x^TM$, for a similar reason.
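Using these two results (and noting that, by transposing the scalar $x^TMy$, its gradient with respect to $x$ is $y^TM^T$), the gradient of (1) is (indeed!) a $2n$-dimensional row vector, which is:
$$\nabla f(x,y)=\begin{pmatrix}x^TA^TA-y^TB^TA & \ \ y^TB^TB-x^TA^TB\end{pmatrix}$$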
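A quick numerical sanity check can be reassuring here. The sketch below (NumPy; the sizes, the random seed and the helper name `numerical_gradient` are arbitrary choices made for illustration) compares what results 1) and 2) give for (1), written as column vectors $A^T(Ax-By)$ and $-B^T(Ax-By)$, against central finite differences:

```python
import numpy as np

# Illustrative sizes and random data (assumptions, not from the question).
rng = np.random.default_rng(0)
m, n = 5, 3
A = rng.standard_normal((m, n))
B = rng.standard_normal((m, n))
x = rng.standard_normal(n)
y = rng.standard_normal(n)

def f(x, y):
    """f(x, y) = 1/2 * ||Ax - By||^2."""
    r = A @ x - B @ y
    return 0.5 * (r @ r)

# Analytic gradient, stored as column vectors (transposes of the row blocks):
#   d/dx: (x^T A^T A - y^T B^T A)^T = A^T (Ax - By)
#   d/dy: (y^T B^T B - x^T A^T B)^T = -B^T (Ax - By)
r = A @ x - B @ y
grad = np.concatenate([A.T @ r, -B.T @ r])

def numerical_gradient(x, y, eps=1e-6):
    """Central finite differences on the stacked vector z = (x, y)."""
    z = np.concatenate([x, y])
    g = np.zeros_like(z)
    for i in range(z.size):
        e = np.zeros_like(z)
        e[i] = eps
        zp, zm = z + e, z - e
        g[i] = (f(zp[:n], zp[n:]) - f(zm[:n], zm[n:])) / (2 * eps)
    return g

# Agreement up to floating-point error supports the formula.
print(np.allclose(grad, numerical_gradient(x, y), atol=1e-6))
```

Since $f$ is quadratic, central differences are exact up to rounding, so the tolerance only has to absorb floating-point error.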
Remarks:
1) Yes, the Hessian is indeed a $2n \times 2n$ matrix.
2) An alternative derivation of (1) is obtained by writing:
$$f(x,y) := \frac{1}{2}\|Ax-By\|^2= \frac{1}{2}\begin{pmatrix}x^T \ \ y^T\end{pmatrix}\begin{pmatrix}A^T\\-B^T\end{pmatrix}\begin{pmatrix}A \ \ -B\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix}$$
$$= \frac{1}{2}\begin{pmatrix}x^T \ \ y^T\end{pmatrix}\begin{pmatrix}A^TA&-A^TB\\-B^TA&B^TB\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix}$$
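The block matrix in this last expression is symmetric, so it is the (constant) Hessian of $f$ with respect to the stacked vector $z=(x,y)$, and the gradient as a column vector is that matrix times $z$. As a hedged illustration, the following NumPy sketch (sizes, seed and variable names such as `C`, `H`, `H_blocks` are assumptions made for the example) checks these identities numerically:

```python
import numpy as np

# Illustrative sizes and random data (assumptions, not from the question).
rng = np.random.default_rng(1)
m, n = 4, 3
A = rng.standard_normal((m, n))
B = rng.standard_normal((m, n))

# C = (A  -B) is m x 2n, so that Ax - By = C z with z = (x, y).
C = np.hstack([A, -B])
H = C.T @ C  # 2n x 2n matrix of the quadratic form

# Same matrix assembled block by block, as in the displayed formula.
H_blocks = np.block([[A.T @ A, -A.T @ B],
                     [-B.T @ A, B.T @ B]])

z = rng.standard_normal(2 * n)
x, y = z[:n], z[n:]
r = A @ x - B @ y

f_direct = 0.5 * (r @ r)      # 1/2 ||Ax - By||^2
f_quad = 0.5 * (z @ H @ z)    # 1/2 z^T H z
grad_blocks = np.concatenate([A.T @ r, -B.T @ r])

print(np.allclose(H, H_blocks))        # block form matches C^T C
print(np.allclose(H, H.T))             # H is symmetric
print(np.isclose(f_direct, f_quad))    # both expressions of f agree
print(np.allclose(H @ z, grad_blocks)) # gradient (as a column) equals H z
```

Because $H = C^TC$, it is also positive semidefinite, which is consistent with $f$ being a convex function of $(x,y)$.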