Gradient in each cell of 3d box with dependence on neighbours


I am working on a statistical likelihood problem as part of my PhD and have been trying to derive a gradient, but the expression I arrive at does not appear to be correct.

The likelihood is calculated as a single value for a 3-dimensional box $\mathbf{T}$ of size $n_x \times n_y \times n_z$. The vector $\mathbf{j} = \bigl( \begin{smallmatrix} x \\ y \\ z \end{smallmatrix} \bigr)$ specifies the 3d index of a cell $T_\mathbf{j}$ in the box.

I have the following term in my likelihood $\mathscr{L}(\mathbf{T})$ :

$$\mathscr{L}(\mathbf{T}) = \ldots -\sum_\mathbf{i}^n\left(\sum_\mathbf{j}^n T_\mathbf{j} \Xi_{\mathbf{j},\mathbf{i}}\right)^2$$

where $\mathbf{i}$ is a 3d index just like $\mathbf{j}$, and $\Xi_{\mathbf{j},\mathbf{i}}$ is $$\Xi_{\mathbf{j},\mathbf{i}} = \sum_{k=x,y,z}C_k\left(-2\delta_{\mathbf{j},\mathbf{i}} + \delta_{\mathbf{j},\mathbf{i} + \mathbf{\hat{k}}} + \delta_{\mathbf{j},\mathbf{i} - \mathbf{\hat{k}}}\right). $$ Here $C_k$ is a constant, $\mathbf{\hat{k}}$ is the unit vector along dimension $k$, and $\delta$ is the Kronecker delta. The operator $\Xi_{\mathbf{j},\mathbf{i}}$ means that each cell receives a contribution from the cells one step away along each of $x$, $y$ and $z$.
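To make the action of $\Xi$ concrete, here is a minimal numpy sketch (my own illustration, not part of the problem statement) that computes $S_\mathbf{i} = \sum_\mathbf{j} T_\mathbf{j}\,\Xi_{\mathbf{j},\mathbf{i}}$ by shifting the box one step along each axis. It assumes cells outside the box contribute zero, a boundary convention the question leaves open:

```python
import numpy as np

def apply_xi(T, C):
    """Compute S_i = sum_j T_j Xi_{j,i} for every cell i of the 3D box T.

    C = (Cx, Cy, Cz) are the stencil constants.  Cells outside the box
    are taken to be zero (this boundary convention is an assumption).
    """
    out = np.zeros_like(T, dtype=float)
    for k, Ck in enumerate(C):
        plus = np.roll(T, -1, axis=k)   # T at i + k_hat (wraps around)
        minus = np.roll(T, +1, axis=k)  # T at i - k_hat (wraps around)
        # undo the wrap-around so out-of-box neighbours count as zero
        idx = [slice(None)] * T.ndim
        idx[k] = -1
        plus[tuple(idx)] = 0.0
        idx[k] = 0
        minus[tuple(idx)] = 0.0
        out += Ck * (-2.0 * T + plus + minus)
    return out
```

For a unit spike in the middle of a $3\times3\times3$ box with all $C_k = 1$, this returns $-6$ at the centre and $+1$ at each of the six face neighbours, i.e. the usual discrete-Laplacian stencil.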

I need to calculate the gradient of the likelihood with respect to each cell, $\nabla_\mathbf{j}\mathscr{L}(\mathbf{T})$. For simplicity, I write the component for some cell $\mathbf{m}$ as $\partial\mathscr{L}/\partial T_\mathbf{m}$. The other terms are straightforward, but this last term is where I run into trouble:

\begin{align}\frac{\partial\mathscr{L}}{\partial T_\mathbf{m}} & = \ldots -\frac{\partial}{\partial T_\mathbf{m}}\sum_\mathbf{i}^n\left(\sum_\mathbf{j}^n T_\mathbf{j} \Xi_{\mathbf{j},\mathbf{i}}\right)^2 \\ & = \ldots -2\left(\sum_\mathbf{i}^n\sum_\mathbf{j}^n T_\mathbf{j} \Xi_{\mathbf{j},\mathbf{i}}\right)\frac{\partial}{\partial T_\mathbf{m}}\left(\sum_\mathbf{i}^n\sum_\mathbf{j}^n T_\mathbf{j} \Xi_{\mathbf{j},\mathbf{i}}\right) \end{align}

It is this last factor that I have been unable to derive properly: no matter how I try, it comes out to zero, which it should not. I could very much use some help with this.

Thank you.


Best answer:

You can flatten the tensor $T$ into a vector; the index mapping is straightforward: $$\eqalign{ T &\in {\mathbb R}^{N_x\times N_y\times N_z} \iff x \in{\mathbb R}^{N_xN_yN_z\times 1} \\ x_\beta &= T_{ijk},\quad \beta \iff (i,j,k) \\ \beta &= i + (j-1)N_x + (k-1)N_xN_y \\ i &= 1 + (\lambda-1)\,{\rm mod}\,N_x \\ j &= 1 + (\lambda-1)\,{\rm div}\;N_x \\ k &= 1 + (\beta-1)\,{\rm div}\;(N_xN_y) \\ &\quad{\rm where}\quad\lambda = 1 + (\beta-1)\,{\rm mod}\,(N_xN_y) \\ }$$

Similarly, you can flatten the sixth-order tensor $\Xi$ into a matrix whose elements are $$\eqalign{ M_{\beta\alpha} &= \Xi_{(ijk)\,(\ell mn)},\quad \beta \iff (i,j,k),\quad \alpha \iff (\ell,m,n) \\ }$$

The flattening turns this into a standard matrix problem (writing ${\cal L}$ for just the quadratic term; the leading minus sign in the full likelihood carries through unchanged): $$\eqalign{ {\cal L} &= M^Tx:M^Tx \\ d{\cal L} &= 2\,M^Tx:M^Tdx \\ &= 2\,MM^Tx:dx \\ \frac{\partial \cal L}{\partial x} &= 2\,MM^Tx \;=\; g \quad\big({\rm the\ gradient\ vector}\big) \\ }$$

The gradient vector can be mapped back into a tensor, i.e. $G_{ijk} = g_\beta$, or it can be translated into the original variables with those horrible triple-index vectors: $$\frac{\partial \cal L}{\partial T_{\bf m}} = 2\sum_{\bf i}\sum_{\bf j} T_{\bf j}\,\Xi_{\bf j,i}\,\Xi_{\bf m,i}$$
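The flattened formulation is easy to verify numerically. Below is a sketch (the box size, constants and names are my own choices; numpy's row-major `ravel_multi_index` plays the role of the $\beta \iff (i,j,k)$ map, which differs from the column-major formula above only by a relabelling) that builds $M$ from the stencil, assuming zero values outside the box, and checks $g = 2MM^Tx$ against a central finite difference of the quadratic term:

```python
import numpy as np

shape = (3, 4, 2)              # (Nx, Ny, Nz); an arbitrary small box
C = (0.5, 1.0, 2.0)            # stencil constants Cx, Cy, Cz
N = shape[0] * shape[1] * shape[2]

# M[beta, alpha] = Xi_{j,i}, with beta <-> j and alpha <-> i flattened
# via numpy's row-major ordering (the ordering is only a relabelling).
M = np.zeros((N, N))
for i in np.ndindex(shape):
    alpha = np.ravel_multi_index(i, shape)
    for k, Ck in enumerate(C):
        M[alpha, alpha] += -2.0 * Ck      # the -2 delta_{j,i} term
        for step in (+1, -1):             # the j = i +/- k_hat terms
            j = list(i)
            j[k] += step
            if 0 <= j[k] < shape[k]:      # out-of-box neighbours drop out
                M[np.ravel_multi_index(tuple(j), shape), alpha] += Ck

rng = np.random.default_rng(0)
x = rng.standard_normal(N)     # a flattened random box T

def quad(v):
    """The quadratic term (M^T v) : (M^T v)."""
    w = M.T @ v
    return w @ w

g = 2.0 * M @ (M.T @ x)        # analytic gradient g = 2 M M^T x

# central finite difference in one component as a spot check
eps = 1e-6
e = np.zeros(N)
e[7] = 1.0
num = (quad(x + eps * e) - quad(x - eps * e)) / (2 * eps)
```

Because the stencil is symmetric under $\mathbf{j} \leftrightarrow \mathbf{i}$, $M = M^T$ here, and the finite difference agrees with `g[7]` up to rounding error.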


NB: In several steps, the trace/Frobenius product is denoted by a colon, i.e. $$\eqalign{ A:B &= {\rm Tr}(A^TB) = {\rm Tr}(AB^T) \\ &= \sum_i\sum_j A_{ij}\,B_{ij} \\ }$$
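The colon identity is quick to confirm numerically (a purely illustrative check with random matrices):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((3, 4))

frob = (A * B).sum()        # A : B = sum_ij A_ij B_ij
tr1 = np.trace(A.T @ B)     # Tr(A^T B)
tr2 = np.trace(A @ B.T)     # Tr(A B^T)
```

All three expressions evaluate to the same scalar.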