I know how recurrent neural networks work, but let's say that I want to model their behaviour from a statistical point of view: how should I interpret their output? Surprisingly, on the internet I found only websites explaining their internal behaviour, but nothing regarding their statistical interpretation. Below I'll try to model the output myself, but since I'm not so experienced in this field, I need someone to correct me.
Basically, we have a sequence $s_i = \{x_1, x_2, \dots, x_n\}$ which is paired with a label $y_i \in \{0,1\}$. The goal of the task is to predict which label is associated with the sequence. In other words, we want a model that computes:
$$P(y=1 | s_i) = P(y=1 | x_1, x_2,...,x_n)$$
The recurrent neural network is able to do that by updating its output according to the element of the sequence observed at time $t$ and the previous hidden state:
$$P_t(y=1 \mid x_1, \dots, x_t) = f(h_{t-1}, x_t)$$
The hidden state computed at timestep $t-1$ takes into account all elements of the sequence up to timestep $t-1$. At each step, the hidden state is then updated:
$$h_t = g(h_{t-1}, x_t)$$
When the end of the sequence is reached, the output of the model will be:
$$P_n(y=1 \mid x_1, \dots, x_n) = P(y=1 \mid s_i)$$
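To make sure I understand my own formulation, here is a minimal NumPy sketch of the recurrence above (dimensions and weights are arbitrary and random, purely for illustration). Note that I apply the readout to $h_t$, which should be equivalent to $f(h_{t-1}, x_t)$ since $h_t = g(h_{t-1}, x_t)$:

```python
import numpy as np

# Toy, untrained RNN classifier: random weights, assumed dimensions.
rng = np.random.default_rng(0)

d_in, d_h = 3, 5                      # input and hidden sizes (assumed)
W_xh = rng.normal(size=(d_h, d_in))   # input-to-hidden weights
W_hh = rng.normal(size=(d_h, d_h))    # hidden-to-hidden weights
w_out = rng.normal(size=d_h)          # hidden-to-output weights

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def rnn_classify(sequence):
    """Return P_t(y=1 | x_1, ..., x_t) for every timestep t."""
    h = np.zeros(d_h)                        # h_0
    probs = []
    for x_t in sequence:                     # one element per timestep
        h = np.tanh(W_xh @ x_t + W_hh @ h)   # h_t = g(h_{t-1}, x_t)
        probs.append(sigmoid(w_out @ h))     # P_t, readout of h_t
    return probs

s_i = rng.normal(size=(4, d_in))      # a toy sequence of n = 4 elements
probs = rnn_classify(s_i)
final = probs[-1]                     # P_n(y=1 | x_1, ..., x_n) = P(y=1 | s_i)
```

So the final entry of `probs` is what I'm interpreting as $P(y=1 \mid s_i)$, while the intermediate entries are the "partial" probabilities conditioned on the prefix seen so far.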
At this point I want to ask you: am I right? Is this a correct interpretation of what a recurrent neural network is trying to do?
If yes, is it possible to do the reverse operation? For example, the convolutional layer has a counterpart, the deconvolutional layer; does something similar exist for recurrent neural networks, so that from a single value or a vector I can have a model that returns a plausible sequence? Thank you in advance, I tried to be as clear as possible. I hope I've achieved the goal.
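To make concrete what I mean by the reverse operation, here is a hypothetical sketch (again random weights, just to illustrate the shape of the computation I have in mind, not a claim about how it is actually done): start from one seed vector $z$ and unroll a recurrence that emits one sequence element per step, feeding each output back in.

```python
import numpy as np

# Hypothetical "reverse" direction: seed vector -> sequence.
rng = np.random.default_rng(1)

d_z, d_h, d_out = 2, 5, 3             # seed, hidden, output sizes (assumed)
W_zh = rng.normal(size=(d_h, d_z))    # maps the seed vector to h_0
W_hh = rng.normal(size=(d_h, d_h))    # hidden-to-hidden weights
W_oh = rng.normal(size=(d_h, d_out))  # feeds the previous output back in
W_ho = rng.normal(size=(d_out, d_h))  # hidden-to-output weights

def generate(z, n_steps):
    """Unroll the recurrence n_steps times from the seed vector z."""
    h = np.tanh(W_zh @ z)             # initial hidden state from z
    x = np.zeros(d_out)               # "previous output", starts empty
    outputs = []
    for _ in range(n_steps):
        h = np.tanh(W_hh @ h + W_oh @ x)  # h_t from h_{t-1} and x_{t-1}
        x = W_ho @ h                      # emit x_t
        outputs.append(x)
    return outputs

seq = generate(rng.normal(size=d_z), n_steps=4)
```

Is something along these lines what people actually use when going from a vector back to a sequence?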