I'm currently learning how to program neural networks. In the most basic ones, you initialize your matrix with random numbers drawn from a gamma distribution.
I'm using a math library that lets me set up a gamma distribution by giving values for shape and rate.
The most basic way to initialize the matrix for a neural network is to use a gamma distribution with mean 0 and standard deviation 1.
I'm struggling to connect these properties, though.
What I have already learned is that I can obtain my desired mean value from shape and rate like this:
$μ = k/β$
where k is the shape and β is the rate.
So far, I couldn't find or understand how to get a desired standard deviation, though.
Extra: If you could explain using first-year college math (if possible), that would be amazing :)
A gamma distribution has a strictly positive mean. If $X$ is gamma distributed with shape $a$ and rate $b$, then the mean of $X$ is $$\mu = \operatorname{E}[X] = a/b,$$ and the standard deviation is $$\sigma = \sqrt{\operatorname{Var}[X]} = \sqrt{a}/b.$$ Note that $a$ and $b$ must be positive.
It follows from the above that, given a desired mean $\mu$ and standard deviation $\sigma$, the shape and rate that produce a gamma distribution with that desired $\mu$ and $\sigma$ are: $$a = (\mu/\sigma)^2, \quad b = \mu/\sigma^2.$$
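In case the step from the first pair of formulas to the second is not obvious, here is one way to see it, using only the mean and variance stated above. Dividing the mean by the variance cancels the shape, which gives the rate; multiplying the rate by the mean then gives the shape:

$$\frac{\mu}{\sigma^2} = \frac{a/b}{a/b^2} = b, \qquad a = \mu b = \frac{\mu^2}{\sigma^2} = \left(\frac{\mu}{\sigma}\right)^2.$$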
For example: If the desired mean is $\mu = 5$ and the desired standard deviation is $\sigma = 2$, then the shape would be $a = (5/2)^2 = 25/4$, and the rate would be $b = 5/2^2 = 5/4$.
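The same conversion can be sketched in a few lines of Python. This is only an illustration, not the API of your math library: the helper name `gamma_shape_rate` is made up for this example. Note also that Python's standard `random.gammavariate` takes shape and *scale*, so the rate has to be inverted before sampling.

```python
import math
import random


def gamma_shape_rate(mean, std):
    """Return (shape, rate) of the gamma distribution with the given mean and std.

    Hypothetical helper for illustration; both arguments must be positive,
    because a gamma distribution cannot have mean 0.
    """
    if mean <= 0 or std <= 0:
        raise ValueError("mean and std must both be positive")
    shape = (mean / std) ** 2   # a = (mu / sigma)^2
    rate = mean / std ** 2      # b = mu / sigma^2
    return shape, rate


shape, rate = gamma_shape_rate(5.0, 2.0)
print(shape, rate)  # 6.25 1.25

# Plug the parameters back into the formulas for the mean and standard deviation.
assert math.isclose(shape / rate, 5.0)
assert math.isclose(math.sqrt(shape) / rate, 2.0)

# Draw samples with the standard library; the second argument is scale = 1 / rate.
rng = random.Random(0)
samples = [rng.gammavariate(shape, 1.0 / rate) for _ in range(10000)]
sample_mean = sum(samples) / len(samples)
```

With this many samples, `sample_mean` should land close to 5. If your library expects a rate, pass `rate` directly instead of `1.0 / rate`.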