Deriving Dot Products


I'm currently going through an ML course and I have a question about derivatives involving dot products. I have a function $$ p(w, x) = \frac{1}{1+e^{-w\cdot x}} $$ where both $w$ and $x$ are $n$-dimensional vectors. When I took the derivative with respect to $w$, I got: $$ \frac{\partial}{\partial w}p(w, x) = \frac{e^{-w\cdot x}}{(1+ e^{-w\cdot x})^2}x $$ After this, I wanted to take the second derivative, and this is where I got confused. To take the second derivative, I would need to multiply by $x$ again, but I don't think that's right, because $x$ is a vector and you can't multiply two $n$-dimensional vectors together: their dimensions don't match up. How would I go about doing this?


Let $f$ be a real-valued function with first and second derivatives $f'$ and $f''$, respectively, and let $w,x\in\mathbb R^n$. By the chain rule, the derivative of $f(w\cdot x)$ with respect to $w$ is given by $$\frac{\partial}{\partial w} f(w\cdot x) = f'(w\cdot x)\frac{\partial}{\partial w}(w\cdot x) = f'(w\cdot x)x.$$ Similarly, for the second derivative, we have that $$\left(\frac{\partial}{\partial w}\right)^2 f(w\cdot x) = f''(w\cdot x)(x\otimes x),$$ where $\otimes$ denotes the Kronecker product.
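
As a worked instance for the function in the question (a sketch; the symbols $\sigma$ and $t$ are introduced here for convenience and do not appear in the original post), take $f(t) = \sigma(t) = \frac{1}{1+e^{-t}}$ with $t = w\cdot x$. Then $$f'(t) = \frac{e^{-t}}{(1+e^{-t})^2} = \sigma(t)\bigl(1-\sigma(t)\bigr), \qquad f''(t) = \sigma(t)\bigl(1-\sigma(t)\bigr)\bigl(1-2\sigma(t)\bigr),$$ so that $$\frac{\partial}{\partial w}p(w,x) = \sigma(w\cdot x)\bigl(1-\sigma(w\cdot x)\bigr)\,x, \qquad \left(\frac{\partial}{\partial w}\right)^2 p(w,x) = \sigma(w\cdot x)\bigl(1-\sigma(w\cdot x)\bigr)\bigl(1-2\sigma(w\cdot x)\bigr)\,(x\otimes x).$$ The first derivative carries a positive sign, because the two minus signs produced by the chain rule cancel.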

It is important to be clear about which derivative operator is meant. Here I used $\left(\frac{\partial}{\partial w}\right)^k = \mathrm{vec}\left(\frac{\partial}{\partial w}\left(\frac{\partial}{\partial w}\right)^{k-1}\right)$, where $\mathrm{vec}$ is the vectorization operator. Another way of thinking about the derivative operator is $\left(\frac{\partial}{\partial w}\right)^2 = \frac{\partial^2}{\partial w\,\partial w'}$, where $w'$ denotes the transpose of $w$. The result would then be the $n\times n$ matrix $f''(w\cdot x)xx'$, whose vectorization is exactly $f''(w\cdot x)(x\otimes x)$. I prefer the first convention, because it mimics the rules of the usual univariate derivative operator by treating the resulting multiplication as a Kronecker product. Moreover, the second convention becomes very cumbersome for third-order derivatives.
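
To make the two conventions concrete, here is a minimal numerical sketch in plain Python for the sigmoid from the question. All function names (`hessian_matrix`, `hessian_kron`, `numeric_hessian`) and the sample vectors are hypothetical and chosen only for illustration. It builds the matrix form $f''(w\cdot x)xx'$ and the Kronecker form $f''(w\cdot x)(x\otimes x)$, and compares the former against a central finite-difference approximation of the Hessian.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def p(w, x):
    """The function from the question: sigmoid of the dot product."""
    return sigmoid(dot(w, x))


def second_derivative_scalar(w, x):
    """f''(w.x) for f = sigmoid, i.e. s * (1 - s) * (1 - 2 s)."""
    s = sigmoid(dot(w, x))
    return s * (1.0 - s) * (1.0 - 2.0 * s)


def hessian_matrix(w, x):
    """Matrix convention: f''(w.x) * x x', an n-by-n nested list."""
    f2 = second_derivative_scalar(w, x)
    return [[f2 * xi * xj for xj in x] for xi in x]


def hessian_kron(w, x):
    """Kronecker convention: f''(w.x) * (x kron x), a flat list of length n*n."""
    f2 = second_derivative_scalar(w, x)
    return [f2 * xi * xj for xi in x for xj in x]


def numeric_hessian(w, x, h=1e-3):
    """Central finite-difference approximation of the Hessian of p in w."""
    n = len(w)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(di, dj):
                v = list(w)
                v[i] += di
                v[j] += dj  # when i == j the two shifts add up
                return p(v, x)
            H[i][j] = (shifted(h, h) - shifted(h, -h)
                       - shifted(-h, h) + shifted(-h, -h)) / (4.0 * h * h)
    return H


if __name__ == "__main__":
    # Arbitrary sample vectors, assumed only for this demonstration.
    w = [0.5, -1.0, 2.0]
    x = [1.0, 2.0, -0.5]
    H = hessian_matrix(w, x)
    N = numeric_hessian(w, x)
    n = len(x)
    max_diff = max(abs(H[i][j] - N[i][j]) for i in range(n) for j in range(n))
    print("max |analytic - numeric| =", max_diff)
    print("Kronecker form:", hessian_kron(w, x))
```

Entry $i\,n + j$ of the flat Kronecker list equals entry $(i, j)$ of the matrix, so the two conventions hold the same $n^2$ numbers and differ only in layout.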