We have an $n \times n$ matrix $L$, given by the Gaussian kernel $$L_{i,j} = \exp\left(-\frac{(x_i-x_j)^2} {2\sigma ^2} \right)$$
where the points $x_i$ and $x_j$ are real numbers that can be thought of as the positions of points $i$ and $j$. ($L$ can be seen as a covariance matrix, describing covariance as a function of the distance between points: the greater the distance between points $i$ and $j$, the smaller the covariance.)
I am interested in finding $$ \frac{\partial\det(L)}{\partial x_i}$$
Can I use Jacobi's formula from the Matrix Cookbook? $$\frac{\partial\det(L)}{\partial x}= \det (L) \operatorname{Tr}\left( L^{-1} \frac{\partial L}{\partial x}\right).$$
Is it correct that $\frac{\partial\det(L)}{\partial x_i} = \det (L) \operatorname{Tr}\left( L^{-1} \frac{\partial L}{\partial x_i}\right),$
where $\frac{\partial L}{\partial x_i}$ is the matrix with entries $\frac{\partial L_{k,l}}{\partial x_i}=0$ for $k \neq i$ and $l \neq i$,
and $\frac{\partial L_{i,j}}{\partial x_i}=\frac{\partial L_{j,i}}{\partial x_i}=-\frac{(x_i-x_j)}{ \sigma ^2} L_{i,j}$?
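This can be sanity-checked numerically with numpy: below is a sketch comparing the Jacobi-formula value against a central finite difference (the points, $\sigma$, and the index $i$ are arbitrary test values, not from the question).

```python
import numpy as np

# Arbitrary test values for the points, sigma, and the index i
sigma = 1.0
x = np.array([0.0, 1.0, 2.5, 3.1, 4.0])
n, i = len(x), 2

def L_of(x):
    # Gaussian kernel matrix L_ij = exp(-(x_i - x_j)^2 / (2 sigma^2))
    D = x[:, None] - x[None, :]
    return np.exp(-D**2 / (2 * sigma**2))

L = L_of(x)

# dL/dx_i is nonzero only in row i and column i; the (i,i) entry is
# zero because L_ii = 1 is constant.
dL = np.zeros((n, n))
dL[i, :] = -(x[i] - x) / sigma**2 * L[i, :]
dL[:, i] = dL[i, :]

# Jacobi's formula: det(L) * Tr(L^{-1} dL/dx_i)
jacobi = np.linalg.det(L) * np.trace(np.linalg.solve(L, dL))

# Central finite difference for comparison
h = 1e-6
xp, xm = x.copy(), x.copy()
xp[i] += h
xm[i] -= h
fd = (np.linalg.det(L_of(xp)) - np.linalg.det(L_of(xm))) / (2 * h)

print(jacobi, fd)  # the two values should agree closely
```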
Collect the $x_i$ vectors into a single matrix $$X = \big[\matrix{x_1&x_2&\ldots&x_n}\big]$$ Following this post, define the Gram matrix and extract its main diagonal into a vector $$G=X^TX,\quad g=\operatorname{diag}(G)$$ Take the log of $L$ and differentiate (elementwise) $$\eqalign{ L &= \exp\left(\frac{2G -g{\tt1}^T -{\tt1}g^T}{2\sigma^2}\right) \\ 2\sigma^2\log(L) &= 2G -g{\tt1}^T -{\tt1}g^T \\ 2\sigma^2\left(\frac{dL}{L}\right) &= 2\,dG -dg\,{\tt1}^T -{\tt1}\,dg^T \\ dL &= \frac{1}{2\sigma^2}L\odot(2\,dG - dg\,{\tt1}^T - {\tt1}\,dg^T) \\ }$$ For later convenience, define the variables $$\eqalign{ \phi &= \det(L) \\ \alpha &= \left(\frac{\phi}{2\sigma^2}\right)\\ R &= L\odot L^{-T} \quad &\big({\rm Hadamard\,Product}\big)\\ P &= \Big(\operatorname{Diag}(R{\tt1})-R\Big) \quad &\big({\rm Laplacian\,of\,}R\big) \\ }$$ Start with the formula for the derivative of the determinant and substitute $dL$ from above. $$\eqalign{ d\phi &= \phi\,L^{-T}:dL \\ &= \phi\,L^{-T}:(2\,dG - dg\,{\tt1}^T - {\tt1}\,dg^T)\odot\frac{L}{2\sigma^2} \\ &=\alpha R:\left(2\,dG -dg\,{\tt1}^T -{\tt1}\,dg^T\right) \\ &= 2\alpha R:dG - 2\alpha R{\tt1}:dg \\ &= 2\alpha R:dG - 2\alpha R{\tt1}:\operatorname{diag}(dG) \\ &= 2\alpha \Big(R -\operatorname{Diag}(R{\tt1})\Big):dG \\ &= -2\alpha P:dG \\ &= -2\alpha P:(X^TdX+dX^TX) \\ &= -4\alpha P:X^TdX \\ &= -4\alpha XP:dX \\ \frac{\partial \phi}{\partial X} &= -4\alpha XP \;=\; -\left(\frac{2\phi}{\sigma^2}\right)XP \\ }$$ So that's the formula for the gradient with respect to the $X$ matrix.
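The final formula can be verified against a finite difference with numpy; here is a sketch in which the dimension $d$, the number of points $n$, $\sigma$, and the random test data are all arbitrary.

```python
import numpy as np

# Arbitrary test dimensions and data
d, n, sigma = 3, 5, 1.5
rng = np.random.default_rng(1)
X = rng.normal(size=(d, n))  # columns are the points x_i

def det_L(X):
    # Build L from the Gram matrix G and its diagonal g, then take det
    G = X.T @ X
    g = np.diag(G)
    L = np.exp((2 * G - g[:, None] - g[None, :]) / (2 * sigma**2))
    return np.linalg.det(L), L

phi, L = det_L(X)
R = L * np.linalg.inv(L).T        # Hadamard product L o L^{-T}
P = np.diag(R @ np.ones(n)) - R   # Laplacian of R
grad = -(2 * phi / sigma**2) * X @ P

# Finite-difference check of one arbitrary entry of the gradient
h = 1e-6
E = np.zeros((d, n))
E[1, 3] = h
fd = (det_L(X + E)[0] - det_L(X - E)[0]) / (2 * h)

print(grad[1, 3], fd)  # the two values should agree closely
```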
To find the gradient with respect to one of its columns, multiply by the standard basis vector $e_i$ $$\eqalign{ x_i &= Xe_i \\ \frac{\partial \phi}{\partial x_i} &= \left(\frac{\partial \phi}{\partial X}\right)e_i \;=\; -\left(\frac{2\phi}{\sigma^2}\right)XPe_i \\ }$$ NB: In several steps, a colon was used as a product notation for the trace operator, i.e. $$\eqalign{ A:B &= \operatorname{Tr}(A^TB) }$$ Use was also made of the fact that $\{G,L,R,P\}$ are all symmetric matrices.
The matrix $R=\big(L\odot L^{-T}\big)$ is known as the relative gain array and has some interesting uses in control theory. One of its properties is $R{\tt1}={\tt1}$, which allows some terms to be simplified. $$\eqalign{ \operatorname{Diag}(R{\tt1}) &= I \\ P &= \big(I-R\big) \\ }$$
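The unit-row-sum property holds for the relative gain array of any invertible matrix, since $\sum_j M_{ij}(M^{-1})_{ji} = (MM^{-1})_{ii} = 1$; a quick numerical check on an arbitrary symmetric positive-definite test matrix:

```python
import numpy as np

# Arbitrary symmetric positive-definite test matrix
rng = np.random.default_rng(2)
A = rng.normal(size=(4, 4))
M = A @ A.T + 4 * np.eye(4)

R = M * np.linalg.inv(M).T  # relative gain array of M
print(R @ np.ones(4))       # every entry should be 1
```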