Differentiation of functions w.r.t. a composed argument


I need help with the following derivative involving inner products: $$\frac{d\, \log(x)^T\,y}{d\,x^T\,y}$$ Here $x$ and $y$ are $n$-dimensional vectors, $T$ indicates transpose, and the logarithm of the vector is taken point-wise: $\log(x)_i = \log(x_i)$.

Thanks in advance for your help!


2
On BEST ANSWER

This question is neither undefined nor ambiguous; it is just a lot of work. Rewrite the given derivative into an equivalent expression with no derivatives except for the $n^2$ atomic expressions $\frac{d\,x_i}{d\,y_j}$ and their analogues (see footnote):

$$\frac{d\,\sum_{i=1}^n \log(x_i) y_i}{d\,\sum_{j=1}^n x_j y_j}$$

$$\frac{\sum_{i=1}^n d\,\log(x_i) y_i}{\sum_{j=1}^n d\,x_j y_j}$$

$$\sum_{i=1}^n \frac{d\,\log(x_i) y_i}{\sum_{j=1}^n d\,x_j y_j}$$

$$\sum_{i=1}^n \left(\frac{\sum_{j=1}^n d\,x_j y_j}{d\,\log(x_i) y_i}\right)^{-1}$$

$$\sum_{i=1}^n \left(\sum_{j=1}^n \frac{ d\,x_j y_j}{d\,\log(x_i) y_i}\right)^{-1}$$

$$\sum_{i=1}^n \left(\sum_{j=1}^n \frac{ x_j\,d\,y_j + y_j\,d\,x_j}{\log(x_i)\,d\, y_i + y_i/x_i\,d\, x_i}\right)^{-1}$$

$$\sum_{i=1}^n \left(\sum_{j=1}^n \frac{ x_j\,d\,y_j}{\log(x_i)\,d\, y_i + y_i/x_i\,d\, x_i} + \frac{y_j\,d\,x_j}{\log(x_i)\,d\, y_i + y_i/x_i\,d\, x_i}\right)^{-1}$$

$$\sum_{i=1}^n \left(\sum_{j=1}^n \left(\frac{\log(x_i)\,d\, y_i + y_i/x_i\,d\, x_i}{ x_j\,d\,y_j}\right)^{-1} + \left(\frac{\log(x_i)\,d\, y_i + y_i/x_i\,d\, x_i}{y_j\,d\,x_j}\right)^{-1} \right)^{-1}$$

$$\sum_{i=1}^n \left(\sum_{j=1}^n \left( \frac{\log(x_i)}{ x_j}\frac{d\, y_i}{d\,y_j} + \frac{y_i/x_i}{ x_j}\frac{d\, x_i}{d\,y_j} \right)^{-1} + \left( \frac{\log(x_i)}{y_j}\frac{d\, y_i}{d\,x_j} + \frac{y_i/x_i}{y_j}\frac{d\, x_i}{d\,x_j} \right)^{-1} \right)^{-1}$$

A special case of this (probably the one you are interested in) is where all of the variables are independent, that is: $$(\forall i, j)\frac{d\,x_i}{d\,y_j} = 0$$ $$(\forall i \ne j)\frac{d\,x_i}{d\,x_j} = \frac{d\,y_i}{d\,y_j} = 0$$ $$(\forall i = j)\frac{d\,x_i}{d\,x_j} = \frac{d\,y_i}{d\,y_j} = 1$$

Edit: I believe the above derivation is only correct when the variables have a functional relationship, like $y = f(x)$. When they are independent, it no longer seems to be correct, and I apologize, I haven't figured out how to solve the problem in that case yet.


Footnote:

Another way of thinking about the expression: At any point in the logic, you should be able to apply an arbitrary set of consistent relationships between the variables and get a consistent result. For example, consider if the poster had asked for the simpler expression:

$$\begin{align} \frac{d\,xy}{d\,x + y} &= \frac{xd\,y + yd\,x}{d\,x + d\,y}\\ &= \frac{x\,d\,y}{d\,x + d\,y} + \frac{y\,d\,x}{d\,x + d\,y}\\ &= \left(\frac{d\,x + d\,y}{x\,d\,y}\right)^{-1} + \left(\frac{d\,x + d\,y}{y\,d\,x}\right)^{-1}\\ &= \left(\frac{d\,x}{x\,d\,y} + \frac{d\,y}{x\,d\,y}\right)^{-1} + \left(\frac{d\,x}{y\,d\,x} + \frac{d\,y}{y\,d\,x}\right)^{-1}\\ &= x\left(\frac{d\,x}{d\,y} + 1\right)^{-1} + y\left(1 + \frac{d\,y}{d\,x}\right)^{-1}\\ \end{align}$$

Continuing the example, you are still free to add a new assumption, such as $y = e^{kx}$.

$$\begin{align} \frac{d\,xy}{d\,x + y} & = \frac{d\,xy}{d\,x} \left(\frac{d\,x + y}{d\,x}\right)^{-1}\\ &= \frac{d\,x\,e^{kx}}{d\,x} \left(\frac{d\,x + e^{kx}}{d\,x}\right)^{-1}\\ &=\frac{(k\,x\,+1)e^{kx}}{ke^{kx} + 1} \end{align}$$

And applying the new assumption to the result: $$\begin{align} x\left(\frac{d\,x}{d\,y} + 1\right)^{-1} + y\left(1 + \frac{d\,y}{d\,x}\right)^{-1} & = x\left(k^{-1}e^{-kx} + 1\right)^{-1} + e^{kx}\left(1 + ke^{kx}\right)^{-1}\\ & = x\left(\frac{ke^{kx}}{ke^{kx} + 1}\right) + e^{kx}\left(\frac{k^{-1}e^{-kx}}{k^{-1}e^{-kx} + 1}\right)\\ & = kx\left(\frac{e^{kx}}{ke^{kx} + 1}\right) + e^{kx}\left(\frac{1}{1 + ke^{kx}}\right)\\ & = \frac{(kx+1)e^{kx}}{ke^{kx} + 1} \end{align}$$

Although this is only one example, it suggests that this sort of calculus is well-defined and consistent. It's just not the limited calculus of functions that you see in schools.
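The footnote example can also be checked numerically. The sketch below is only an illustration under the footnote's assumption $y = e^{kx}$; the function names, the sample points, and the step size `h` are my own choices, not part of the answer. It compares a finite-difference estimate of $d(xy)/d(x+y)$ with the closed form and with the general formula $x\left(\frac{dx}{dy}+1\right)^{-1} + y\left(1+\frac{dy}{dx}\right)^{-1}$.

```python
import math

def ratio_numeric(x, k, h=1e-6):
    """Central-difference estimate of d(xy)/d(x+y) along the curve y = exp(k*x)."""
    f = lambda t: t * math.exp(k * t)   # f = x*y
    g = lambda t: t + math.exp(k * t)   # g = x + y
    df = f(x + h) - f(x - h)
    dg = g(x + h) - g(x - h)
    return df / dg

def ratio_closed_form(x, k):
    """Closed form from the footnote: (k*x + 1) e^{kx} / (k e^{kx} + 1)."""
    e = math.exp(k * x)
    return (k * x + 1) * e / (k * e + 1)

def ratio_general_formula(x, k):
    """x (dx/dy + 1)^{-1} + y (1 + dy/dx)^{-1}, using dy/dx = k e^{kx}."""
    y = math.exp(k * x)
    dydx = k * y
    return x / (1 / dydx + 1) + y / (1 + dydx)

# Hypothetical sample points (k > 0 keeps every denominator nonzero).
for x, k in [(0.3, 1.0), (1.2, 0.5), (-0.7, 2.0)]:
    print(x, k, ratio_numeric(x, k), ratio_closed_form(x, k), ratio_general_formula(x, k))
```

At each sample point the three values should agree up to the finite-difference error, which matches the consistency observed in the footnote for this one example.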

1
On

$(\log x)^T y=\log\left(x_1^{y_1}\cdot x_2^{y_2}\cdots x_n^{y_n}\right)$. So you have to differentiate $\log\left(x_1^{y_1}\cdot x_2^{y_2}\cdots x_n^{y_n}\right)$ with respect to $x_1y_1+x_2y_2+\ldots+x_ny_n$.
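As a quick sanity check of this rewriting, the following sketch compares both sides of the identity for one arbitrarily chosen pair of vectors with positive $x_i$. The helper names and the sample values are assumptions made purely for illustration.

```python
import math

def log_inner_product(x, y):
    """Left-hand side: sum_i log(x_i) * y_i."""
    return sum(math.log(xi) * yi for xi, yi in zip(x, y))

def log_of_power_product(x, y):
    """Right-hand side: log of prod_i x_i ** y_i (requires x_i > 0)."""
    return math.log(math.prod(xi ** yi for xi, yi in zip(x, y)))

# Hypothetical sample vectors.
x = [0.5, 2.0, 3.0]
y = [1.5, -0.5, 2.0]
print(log_inner_product(x, y), log_of_power_product(x, y))
```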

1
On

$\def\nR{\mathbb{R}}\def\sign{\operatorname{sign}}\def\l{\left}\def\r{\right}\def\ltag#1{\tag{#1}\label{#1}}$ The only interpretation of the question I can think of is that you are given a vector of unknowns $z=[x,y]$ and two functions $f(z)=\log(x)^T\cdot y$ and $g(z)=x^T y$, and you should determine $a(z)$ such that $$ \ltag{quest} \frac{df(z)}{dg(z)} = a(z) $$ where $df$ and $dg$ are to be interpreted as differentials. In this case you have \begin{align} df(z) &= a(z)\cdot dg(z)\\ f'(z)\cdot dz &= a(z)\cdot g'(z)\cdot dz\\ (f'(z)-a(z)g'(z))\cdot dz &= 0 \end{align} with $f'(z),g'(z)\in \nR^{1\times N}$, where $N=2n$. Note that this equation is a generalized eigenvalue problem with $a(z)$ as the eigenvalue and $dz\in\nR^N$ as the eigendirection.

In our case it is just a single equation, which we can tackle directly. \begin{align} f'(z) &= \begin{pmatrix} (y./x)^T&\log(x)^T \end{pmatrix}\\ g'(z) &= \begin{pmatrix} y^T&x^T \end{pmatrix} \end{align} Here $y./x$ denotes the component-wise quotient. We have to satisfy the equation \begin{align} \ltag{d} ((y./x)^T-a(z)\cdot y^T)\cdot dx + (\log(x)^T-a(z)\cdot x^T)\cdot dy &= 0 \end{align} Let us assume the differentials $dx,dy$ were not restricted. Then the equations \begin{align} ((y./x)^T-a(z)\cdot y^T)&=0\ltag{allDir1}\\ (\log(x)^T-a(z)\cdot x^T)&=0\ltag{allDir2} \end{align} would all have to be satisfied. For $y_1\neq 0$, the first component of equation \eqref{allDir1} reads as $$ \ltag{a1} a = \frac1{x_1} $$ The other components of \eqref{allDir1} imply (for $y_k\neq 0$) $x_k=\frac1{a}=x_1$ for $k=2,\ldots,n$, and the other equation \eqref{allDir2} says \begin{align} \ltag{a2} a&=\frac{\log(x_k)}{x_k}&&\text{ for }k=1,\ldots,n. \end{align} Equations \eqref{a1} and \eqref{a2} together force $\log(x_k)=1$, so they hold simultaneously only at the special point $x_1=\ldots=x_n=e$ (with $a=1/e$); at a generic point they are not solvable together.

Therefore, away from this special point, all solutions of equation \eqref{d} restrict the differential $dz$.

I think equation \eqref{quest} additionally requires $g'(z)\,dz\neq 0$. One could then ask which differentials $dz$ and which values $a$ are admissible in this case.
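To illustrate why the differential matters, here is a small sketch that evaluates the ratio $\frac{f'(z)\,dz}{g'(z)\,dz}$ at one point for two different directions $dz$. The point, the directions, and the helper names are hypothetical choices for illustration; only the gradients $f'(z)$ and $g'(z)$ are taken from the text above.

```python
import math

def grad_f(x, y):
    """f(z) = log(x)^T y, so f'(z) = (y ./ x, log(x))."""
    return [yi / xi for xi, yi in zip(x, y)] + [math.log(xi) for xi in x]

def grad_g(x, y):
    """g(z) = x^T y, so g'(z) = (y, x)."""
    return list(y) + list(x)

def ratio(x, y, dz):
    """a = (f'(z) dz) / (g'(z) dz) for a direction dz = (dx, dy)."""
    df = sum(a * b for a, b in zip(grad_f(x, y), dz))
    dg = sum(a * b for a, b in zip(grad_g(x, y), dz))
    if dg == 0:
        raise ValueError("direction not admissible: g'(z) dz = 0")
    return df / dg

# Hypothetical point z = (x, y) with n = 2.
x, y = [1.0, 2.0], [3.0, 1.0]
a1 = ratio(x, y, [1.0, 0.0, 0.0, 0.0])  # vary only x_1: gives 1 / x_1
a2 = ratio(x, y, [0.0, 0.0, 0.0, 1.0])  # vary only y_2: gives log(x_2) / x_2
print(a1, a2)
```

The two directions give different values of $a$ at the same point, so at such a point no single $a(z)$ works for all $dz$.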

For instance, the hyperplane $y=0$ is not admissible even though it satisfies \eqref{d}: with $y=0$ we also have $dy=0$, and hence $g'(z)dz = y^Tdx + x^Tdy =0$.

Let us look at a hyperplane with $y=c$ for some fixed $c\in\nR^n$, $c\neq 0$. Because $dy=0$, we need $0\neq g'(z)dz=c^Tdx$.

In this case \eqref{d} gives the equation \begin{align} \sum_{k=1}^n \left(\frac{c_k\,dx_k}{x_k}-a\,c_k\,dx_k\right) = 0 \end{align} Let us try $a=0$ and assume that just two components of $c$ are nonzero, e.g., $c_1=1$ and $c_2=-1$. We end up with the equation \begin{align} \frac{dx_1}{x_1} = \frac{dx_2}{x_2}. \end{align}

With the requirement $c_1\cdot dx_1 + c_2 \cdot dx_2\neq 0$, at least one of the differentials $dx_1 , dx_2$ must locally be nonzero (the considered tangent vector to the manifold at hand has a nonzero component in this direction). Integration yields \begin{align} \int_{x_{10}}^{x_1} \frac{d \bar x_1}{\bar x_1} &= \int_{x_{20}}^{x_2} \frac{d \bar x_2}{\bar x_2}\\ \log\l(\frac{x_1}{x_{10}}\r) &= \log\l(\frac{x_2}{x_{20}}\r)\\ \ltag{sur} \frac{x_1}{x_{10}} &= \frac{x_2}{x_{20}} \end{align} One admissible solution is therefore $a=0$ with a tangent vector $dz$ on plane hypersurfaces with $y=\begin{pmatrix}1&-1&0&\ldots&0\end{pmatrix}^T$ satisfying equation \eqref{sur} and $x_1,x_2\neq 0$.
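This case can be checked numerically. The sketch below assumes $n=3$, the hypothetical starting values $x_{10}=2$, $x_{20}=0.5$, and a curve parameter $t$ with $x_1/x_{10}=x_2/x_{20}=t$; none of these specific numbers come from the answer. Along that curve $f$ should stay constant while $g$ changes, so the ratio of the differences should be close to $a=0$.

```python
import math

def f(x, y):
    """f(z) = log(x)^T y."""
    return sum(math.log(xi) * yi for xi, yi in zip(x, y))

def g(x, y):
    """g(z) = x^T y."""
    return sum(xi * yi for xi, yi in zip(x, y))

y = [1.0, -1.0, 0.0]     # the fixed vector c from the text, padded with a zero
x0 = [2.0, 0.5, 1.0]     # hypothetical x_10, x_20 and a fixed x_3

def curve(t):
    """Points with x_1 / x_10 = x_2 / x_20 = t and x_3 held fixed."""
    return [x0[0] * t, x0[1] * t, x0[2]]

h = 1e-6
t = 1.3
df = f(curve(t + h), y) - f(curve(t - h), y)   # should vanish up to rounding
dg = g(curve(t + h), y) - g(curve(t - h), y)   # nonzero because x_10 != x_20
print(df, dg, df / dg)
```

Here $f = \log(x_{10}/x_{20})$ is constant along the curve, while $g = (x_{10}-x_{20})\,t$ is not, which is consistent with $a=0$ and $g'(z)\,dz\neq 0$.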

This is not a complete characterization of the solution set of \eqref{quest}. But, first, it gives a reasonable interpretation of the question and, second, it gives some insight into the structure of the solution set of \eqref{quest}.


Note that differentials can be interpreted as components of tangent vectors, as described in my answer at https://physics.stackexchange.com/questions/92925/how-to-treat-differentials-and-infinitesimals/93025#93025.