How can I find the gradient of the following function \begin{align} f(W) := \sum_{j \neq {t}} \left[ \max\left(0, [ W x ]_j - \left[ W x \right]_{t} + \delta \right) \right] + \lambda \left\| W \right\|_F^2 \ , \end{align} with respect to the matrix $W \in \mathbb{R}^{m \times n}$, where $x \in \mathbb{R}^n$, $t$ is a fixed index, and $\delta, \lambda$ are known constants? The $j$th element of a vector $y \in \mathbb{R}^n$ is denoted by $[y]_j$.
2026-03-28 03:33:18
Gradient of $f(W)=\sum_{j \neq {t}} \left[ \max\left(0, [ W x ]_j - \left[ W x \right]_{t} + \delta \right) \right] + \lambda \left\| W \right\|_F^2$
Asked by Bumbble Comm (https://math.techqa.club/user/bumbble-comm/detail)
The max function isn't differentiable at zero, but you could replace it with a smooth-max approximation. Or you could proceed as follows, which yields the gradient wherever it exists and a subgradient at the kinks.
Define the variables
$$\eqalign{ y &= Wx, \quad dy = dW\,x \cr k &= t, \quad z = (1e_k^T)y = Ky \cr b &= (y-z + 1\delta) = (I-K)y + 1\delta \cr s &= {\rm sign}(b), \quad p = \tfrac{1}{2}(1+s) \cr a &= p\odot b = \max(0,b)\cr }$$
where $K = 1e_k^T$ (so every entry of $z$ equals $[y]_t$), max() and sign() are applied elementwise, and the latter is defined as
$${\rm sign}(\beta)=\begin{cases} +1 & \beta \ge 0 \\ -1 & \beta \lt 0 \end{cases}$$
Write the function in terms of these variables and find its gradient. The entry of $b$ with index $j=t$ equals $\delta$, so for $\delta \ge 0$ the sum $1:a$ contains one extra term $\delta$, which is subtracted below. Since $p$ is piecewise constant, it is treated as a constant when taking the differential.
$$\eqalign{ f &= 1:a + \lambda W:W - \delta \cr &= p:b + \lambda W:W - \delta \cr \cr df &= p:db + 2\lambda W:dW \cr &= p:(I-K)dy + 2\lambda W:dW \cr &= p:(I-K)dW\,x + 2\lambda W:dW \cr &= \Big((I-K)^Tpx^T + 2\lambda W\Big):dW \cr \frac{\partial f}{\partial W} &= \Big(I-e_k1^T\Big)px^T + 2\lambda W \cr }$$
NB: The above derivation uses the symbol $\odot$ for the elementwise/Hadamard product, $e_k$ for the standard basis vectors, $1$ for the all-ones vector, and $:$ for the trace/Frobenius product, i.e.
$$\eqalign{ A:B = {\rm Tr}(A^TB) }$$
Instead of a smooth-max function, you could re-work the above approach with a smooth-sign function, e.g. $\tanh(\mu b)$ with a sufficiently large $\mu$-parameter.
$$\lim_{\mu\to\infty}\tanh(\mu b) = {\rm sign}(b)$$
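As a sanity check, the closed-form result above can be compared against central finite differences. The following is a minimal NumPy sketch, not part of the original answer: the function names `hinge_loss_and_grad` and `numerical_grad`, the 0-based index `t`, and the small test values are all assumptions chosen for illustration. The test point is picked so that no entry of $b$ is zero, i.e. away from the kinks of max().

```python
import numpy as np

def hinge_loss_and_grad(W, x, t, delta, lam):
    """Return f(W) and the (sub)gradient (I - e_k 1^T) p x^T + 2*lam*W."""
    y = W @ x                            # y = W x
    b = y - y[t] + delta                 # b = (I - K) y + 1*delta
    p = (b >= 0).astype(float)           # p = (1 + sign(b)) / 2 with sign(0) = +1
    a = p * b                            # a = max(0, b), elementwise
    # Sum over j != t only: drop the entry a[t] = max(0, delta).
    f = a.sum() - a[t] + lam * np.sum(W * W)
    # (I - e_k 1^T) p equals p with sum(p) subtracted from its k-th entry.
    q = p.copy()
    q[t] -= p.sum()
    grad = np.outer(q, x) + 2.0 * lam * W
    return f, grad

def numerical_grad(W, x, t, delta, lam, eps=1e-6):
    """Central finite-difference approximation of df/dW (illustrative helper)."""
    G = np.zeros_like(W)
    for i in range(W.shape[0]):
        for j in range(W.shape[1]):
            E = np.zeros_like(W)
            E[i, j] = eps
            f_plus, _ = hinge_loss_and_grad(W + E, x, t, delta, lam)
            f_minus, _ = hinge_loss_and_grad(W - E, x, t, delta, lam)
            G[i, j] = (f_plus - f_minus) / (2.0 * eps)
    return G

# Hypothetical test data: m = 3, n = 4, target index t = 1 (0-based).
W = np.arange(12, dtype=float).reshape(3, 4) / 10.0
x = np.array([1.0, -2.0, 0.5, 3.0])
t, delta, lam = 1, 0.5, 0.1

f, grad = hinge_loss_and_grad(W, x, t, delta, lam)
max_err = np.max(np.abs(grad - numerical_grad(W, x, t, delta, lam)))
print("f =", f)
print("max abs difference from finite differences:", max_err)
```

Note that the code drops `a[t]` directly rather than subtracting $\delta$, so the function value stays correct even if $\delta < 0$; the gradient expression is unchanged because the $k$-th entry of $p$ cancels in $(I - e_k1^T)p$.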