I want to solve the following problem $$\min_{X \in \mathbb{R}^{2 \times 2}}\{h(A(X))+\|X\|_*\},$$ where $\|\cdot\|_*$ is the nuclear norm and $h:\mathbb{R}^{2}\to \mathbb{R}$ is given by $h(y)=\frac{1}{2}\|B^{1/2}y-B^{-1/2}d\|_2^2$ for the positive definite matrix $$B = \begin{bmatrix} 3/2 & -2\\ -2 & 3 \end{bmatrix}$$ and the vector $$d = \begin{bmatrix} 5/2\\ -1 \end{bmatrix},$$ and $A:\mathbb{R}^{2\times 2}\to \mathbb{R}^2$ is the linear operator given by $$A(X)=(X_{11},X_{22}).$$ To solve this problem, we need to make use of the first-order optimality condition; however, I could not figure out how to compute the subdifferential of the first term. Basically, it is the derivative of a scalar function of a matrix variable, but the notes from the Wiki did not make sense to me. Can anyone show this for me? Many thanks.
The Derivative of a Matrix Function with Nuclear Norm Term
Asked by Bumbble Comm
$\renewcommand{\Re}{\mathbb{R}}$ Some general observations:
For a function $\phi:\Re^n\to\Re^m$ we have the Jacobian matrix, and at the same time $\Re^{n\times n}$ is isomorphic to $\Re^{n^2}$, which is a space of vectors rather than a space of matrices. Therefore, the notion of derivative extends naturally to functions defined on spaces of matrices.
Subgradients are well defined for functions $f:\mathcal{H}\to\Re\cup\{+\infty\}$, where $\mathcal{H}$ is a Hilbert space; in particular, $\Re^{2\times 2}$ equipped with the Frobenius inner product is a Hilbert space, so optimality conditions for optimization problems over spaces of matrices are well posed.
We can see $A$ as an operator $A:\Re^{2\times 2} \ni X \mapsto A(X)\in\Re^2$, or equivalently, after vectorizing $X$, as $$ \tilde{A}:\Re^4\ni X = \begin{bmatrix}X_{11}\\X_{12}\\X_{21}\\X_{22}\end{bmatrix}\mapsto \tilde{A}X = \begin{bmatrix}X_{11}\\X_{22}\end{bmatrix} \in \Re^2. $$
Now define a function $\phi:\Re^4 \to \Re$ as $\phi(X) = h(\tilde{A}X)$; by the chain rule, its gradient is $\nabla \phi(X) = \tilde{A}^*\nabla h(\tilde{A}X)$, where $\tilde{A}^*$ is the adjoint linear operator of $\tilde{A}$. Since $\nabla h(y) = B^{1/2}(B^{1/2}y-B^{-1/2}d) = By - d$, this gives $\nabla \phi(X) = \tilde{A}^*(B\tilde{A}X - d)$. Notice that $\nabla \phi$ is a function from $\Re^4$ to $\Re^4$, which can be seen as a function from $\Re^{2\times 2}$ to $\Re^{2\times 2}$.
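As a sanity check (my own sketch, not part of the original answer), the formula $\nabla \phi(X) = \tilde{A}^*(B\tilde{A}X - d)$ can be verified numerically against central finite differences. The code uses the expansion $h(y)=\tfrac12 y^\top B y - d^\top y + \tfrac12 d^\top B^{-1}d$ to avoid computing matrix square roots; all function names here are illustrative.

```python
import numpy as np

B = np.array([[1.5, -2.0], [-2.0, 3.0]])   # the given positive definite matrix
d = np.array([2.5, -1.0])                  # the given vector

def A(X):
    # A(X) = (X11, X22): pick out the diagonal entries
    return np.array([X[0, 0], X[1, 1]])

def A_adj(y):
    # adjoint of A: place y on the diagonal, zeros elsewhere
    return np.diag(y)

def Phi(X):
    # h(A(X)) via the expansion ½yᵀBy − dᵀy + ½dᵀB⁻¹d (no matrix square roots)
    y = A(X)
    return 0.5 * y @ B @ y - d @ y + 0.5 * d @ np.linalg.solve(B, d)

def grad_Phi(X):
    # chain rule: A*(∇h(A(X))) with ∇h(y) = By − d
    return A_adj(B @ A(X) - d)

# central finite-difference check at a random point
rng = np.random.default_rng(0)
X = rng.standard_normal((2, 2))
eps = 1e-6
G_fd = np.zeros((2, 2))
for i in range(2):
    for j in range(2):
        E = np.zeros((2, 2))
        E[i, j] = eps
        G_fd[i, j] = (Phi(X + E) - Phi(X - E)) / (2 * eps)

print(np.allclose(G_fd, grad_Phi(X), atol=1e-6))   # → True
```

Central differences are exact (up to rounding) for a quadratic function, so agreement here is a strong check of the gradient formula.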
To be more precise, the gradient of the function $\Phi:\Re^{2\times 2}\to\Re$, $X \mapsto h(A(X))$, is the function $$ \nabla \Phi: \Re^{2\times 2} \ni X \mapsto A^* \nabla h(A(X)) = A^*(BA(X)-d) \in \Re^{2\times 2}, $$ where $A^*:\Re^2\to\Re^{2\times 2}$ is the adjoint linear operator of $A$; concretely, $A^*(y_1,y_2) = \begin{bmatrix} y_1 & 0\\ 0 & y_2 \end{bmatrix}$, as one checks from the defining identity $\langle A(X), y\rangle = \langle X, A^*(y)\rangle_F$.
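The claim that $A^*$ places its argument on the diagonal can itself be checked numerically via the adjoint identity $\langle A(X), y\rangle = \langle X, A^*(y)\rangle_F$; this is a small sketch with illustrative function names, not part of the original answer.

```python
import numpy as np

def A(X):
    # A(X) = (X11, X22)
    return np.array([X[0, 0], X[1, 1]])

def A_adj(y):
    # candidate adjoint: y on the diagonal, zeros elsewhere
    return np.diag(y)

rng = np.random.default_rng(1)
X = rng.standard_normal((2, 2))
y = rng.standard_normal(2)

lhs = A(X) @ y                # ⟨A(X), y⟩ in R²
rhs = np.sum(X * A_adj(y))    # ⟨X, A*(y)⟩, Frobenius inner product

print(np.isclose(lhs, rhs))   # → True
```

Both sides reduce to $X_{11}y_1 + X_{22}y_2$, which is why the identity holds for every $X$ and $y$.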
Overall, I would recommend writing down the optimality conditions using the proximal operator of the nuclear norm.
See Neal Parikh, Stephen Boyd - Proximal Algorithms, Section 6.7.3.