How to evaluate a multiple summation


I came across the following equation for neural networks: $$ J(\theta) = \frac{-1}{2m} \left[\sum_{i=1}^m y^{(i)}\log(h_\theta(x^{(i)})) + (1-y^{(i)})\log(1-h_\theta(x^{(i)}))\right] + \frac{\lambda}{2m} \sum_{l=1}^{L-1} \sum_{i=1}^{S_l} \sum_{j=1}^{S_l + 1} (\theta_{ji}^{l})^2 $$

I don't understand how to do the $\sum_{l=1}^{L-1} \sum_{i=1}^{S_l} \sum_{j=1}^{S_l + 1} (\theta_{ji}^{l})^2$, because of the numerous sums.

How do you do it?

Thanks


There are 3 answers below.

BEST ANSWER

Perhaps it would be easier to understand the concept by looking at an example.

Assume $M=3,N=2,P = 2$.

Then the sum $\displaystyle\sum_{i=1}^{M}\sum_{j=1}^{N}\sum_{k=1}^{P}x_{ij}^k$ , also sometimes denoted as $ \ \displaystyle\sum_{i,j,k=1}^{M,N,P} x_{ij}^k \,$, can be expanded as: $$ \sum_{i,j,k=1}^{M,N,P} x_{ij}^k = \sum_{i=1}^{M} \left(\sum_{j=1}^{N} \bigg(\sum_{k=1}^{P} x_{ij}^k \bigg) \right) = \sum_{i=1}^{M} \left(\sum_{j=1}^{N} \bigg( x_{ij}^1 + x_{ij}^2 \bigg) \right) = \sum_{i=1}^{M} \left(x_{i1}^1 + x_{i1}^2 + x_{i2}^1 + x_{i2}^2\right) = \\ = x_{11}^1 + x_{11}^2 + x_{12}^1 + x_{12}^2 + x_{21}^1 + x_{21}^2 + x_{22}^1 + x_{22}^2 + x_{31}^1 + x_{31}^2 + x_{32}^1 + x_{32}^2. $$
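The expansion above can be checked numerically. Here is a minimal sketch, where `x[i][j][k]` plays the role of $x_{ij}^k$ and the values are made up purely for illustration:

```python
# Expand the triple sum from the example: M = 3, N = 2, P = 2.
M, N, P = 3, 2, 2

# Hypothetical values: x_{ij}^k = 100*i + 10*j + k (1-based), just for illustration.
x = [[[100 * i + 10 * j + k for k in range(1, P + 1)]
      for j in range(1, N + 1)]
     for i in range(1, M + 1)]

# Triple sum as three nested loops, innermost sum (over k) evaluated first.
total = 0
for i in range(M):
    for j in range(N):
        for k in range(P):
            total += x[i][j][k]

# The same value, written as the flat 12-term expansion.
flat = sum(x[i][j][k] for i in range(M) for j in range(N) for k in range(P))
assert total == flat
print(total)  # → 2598
```

Whichever way the loops are nested, the same 12 terms are added, matching the fully expanded sum above.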


It is actually possible to write such sums in a more compact form by alternating upper and lower indices. For more details, see the Einstein summation convention.
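In code, a similarly compact notation is available through NumPy's `einsum`, where a single subscripts string replaces the nested loops. This is only a loose analogy to the index-notation idea mentioned above, not the convention itself; the array shape is made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 2, 2))  # axes play the roles of i, j, k

# Triple sum written as explicit nested iteration.
triple_loop = sum(x[i, j, k]
                  for i in range(3) for j in range(2) for k in range(2))

# The same reduction in one call: 'ijk->' sums over all three indices.
einsum_sum = np.einsum('ijk->', x)

assert np.isclose(triple_loop, einsum_sum)
```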

Another answer:

Apply $\sum_{j=1}^{S_l + 1} (\theta_{ji}^{l})^2$ first; you will get something like

$\sum_{j=1}^{S_l + 1} (\theta_{ji}^{l})^2 = (\theta_{1i}^{l})^2 + (\theta_{2i}^{l})^2 + \cdots + (\theta_{(S_l+1)i}^{l})^2 $

And $\sum_{l=1}^{L-1} \sum_{i=1}^{S_l} \sum_{j=1}^{S_l + 1} (\theta_{ji}^{l})^2 = \sum_{l=1}^{L-1} \sum_{i=1}^{S_l} \left[(\theta_{1i}^{l})^2 + (\theta_{2i}^{l})^2 + \cdots + (\theta_{(S_l+1)i}^{l})^2 \right]$

Then apply the second-to-last summation over $i$, and finally the first summation over $l$.
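The inner-to-outer order described above maps directly onto nested loops: the $j$ sum sits innermost and is evaluated first. A minimal sketch, where the layer sizes `S` and the weights `Theta` are hypothetical values chosen only for illustration:

```python
# Hypothetical layer sizes S_1..S_4, so L = 4 (all values made up).
S = [3, 5, 4, 2]
L = len(S)

# Theta[l][j][i] plays the role of theta_{ji}^{l}; dimensions follow the
# question's bounds: i runs over 1..S_l, j over 1..S_l + 1 (0-based here).
Theta = [[[0.1 * (j + 1) + 0.01 * (i + 1) for i in range(S[l])]
          for j in range(S[l] + 1)]
         for l in range(L - 1)]

total = 0.0
for l in range(L - 1):              # first (outermost) summation, over l
    for i in range(S[l]):           # second summation, over i
        for j in range(S[l] + 1):   # innermost summation over j, applied first
            total += Theta[l][j][i] ** 2
print(total)
```

Swapping the loop order would not change the result, since each term $(\theta_{ji}^{l})^2$ is added exactly once either way.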

Another answer:

Keeping the outer summation aside, for each $l$, the two inner summations over $i$ and $j$ amount to computing the squared Frobenius norm of a matrix $\Theta^l$ whose $(i,j)$-th element is $\theta^l_{ji}$. Your triple summation can then be viewed as the sum of the squared Frobenius norms of the matrices $\Theta^1,\Theta^2, \cdots,\Theta^{L-1}$.
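This view is easy to confirm numerically. A sketch with NumPy, using two made-up weight matrices in place of $\Theta^1$ and $\Theta^2$:

```python
import numpy as np

# Hypothetical weight matrices (shapes chosen arbitrarily for illustration);
# entry [j, i] plays the role of theta_{ji}^{l}.
rng = np.random.default_rng(1)
Thetas = [rng.standard_normal((4, 3)), rng.standard_normal((3, 4))]

# The triple sum, written inner-to-outer.
triple = sum(th[j, i] ** 2
             for th in Thetas
             for i in range(th.shape[1])
             for j in range(th.shape[0]))

# The same value as a sum of squared Frobenius norms.
frob = sum(np.linalg.norm(th, 'fro') ** 2 for th in Thetas)

assert np.isclose(triple, frob)
```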