I am confused about the partial derivatives I'm computing for my deep learning model. The sigmoid (input gate) formula is: $i_t = \sigma(W_{ii}*x_t+b_{ii} + W_{hi}*h_{t-1}+b_{hi})$
Notation meaning:
- $\sigma$ = Sigmoid Function
- $W_{ii}$ = Weight Input
- $x_t$ = Input data
- $b_{ii}$ = Bias Input
- $W_{hi}$ = Weight Hidden
- $h_{t-1}$ = Hidden State previous Time Step
- $b_{hi}$ = Bias Hidden
I've found that the derivative of the regular sigmoid function is: $\frac{d}{dx}\sigma(x) = \sigma(x) \cdot (1-\sigma(x))$
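This identity is easy to sanity-check numerically. Here is a small sketch (the test point `x = 0.8` is arbitrary, just for illustration) comparing the closed form against a central finite difference:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

x = 0.8  # arbitrary test point

# closed-form derivative: sigma(x) * (1 - sigma(x))
analytic = sigmoid(x) * (1.0 - sigmoid(x))

# central finite difference approximation of the derivative
eps = 1e-6
numeric = (sigmoid(x + eps) - sigmoid(x - eps)) / (2 * eps)

print(analytic, numeric)  # the two values should agree closely
```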
Then, I tried to take the partial derivative of the $i_t$ formula w.r.t. $W_{ii}$; my current answer is: $\frac{\partial i_t}{\partial W_{ii}} = \sigma(x_t) \cdot (1-\sigma(x_t))$
Am I doing this wrong? If so, can someone help me correct my answer?
UPDATE: I've tried again, and these are my steps now:
- Assume that $z = W_{ii}*x_t+b_{ii} + W_{hi}*h_{t-1}+b_{hi}$
- Then the $i_t$ formula will be: $i_t = \sigma(z)$
- The partial derivative of $i_t$ w.r.t. $z$ is: $\frac{\partial i_t}{\partial z} = \sigma(z) \cdot (1-\sigma(z))$
- On the other hand, the partial derivative of $z$ w.r.t. $W_{ii}$ is: $\frac{\partial z}{\partial W_{ii}} = x_t$
- Finally, by the chain rule, the final partial derivative is: $\frac{\partial i_t}{\partial W_{ii}} = \sigma(z) \cdot (1-\sigma(z)) \cdot x_t$
But I don't know whether it's correct or not. Can someone check my updated answer? Thank you in advance.