Vanilla neural networks take a fixed-size vector as input and produce a fixed-size vector as output. They assume inputs and outputs are independent of one another.
Recurrent neural nets allow us to operate over sequences of vectors: sequences in the input, the output, or both.
Examples:
(1) Vanilla: fixed input to fixed output (one-to-one). (2) Image captioning: one image to a sentence of words (one-to-many). (3) Sentence classification, e.g. sentiment (many-to-one). (4) Sentence translation (many-to-many). (5) Video frames to per-frame labels (synced many-to-many).
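As a rough sketch, the five regimes differ only in their input/output signatures. A minimal illustration using Python type hints (the function names are my own, purely illustrative):

```python
from typing import List

Vector = List[float]  # one fixed-size vector

# Illustrative signatures for the regimes above; names are assumptions
# made up for this sketch, not a standard API.
def one_to_one(x: Vector) -> Vector: ...                        # (1) vanilla net
def one_to_many(x: Vector) -> List[Vector]: ...                 # (2) image captioning
def many_to_one(xs: List[Vector]) -> Vector: ...                # (3) classification
def many_to_many(xs: List[Vector]) -> List[Vector]: ...         # (4) translation
def synced_many_to_many(xs: List[Vector]) -> List[Vector]: ...  # (5) one label per frame
```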
They are trained on sequential or time-series data, where earlier elements of the sequence affect later outputs.
Recurrent connections: the output of a neuron at one time step is fed back as input to the network at the next time step.
Recurrent unit: a single hidden vector h, updated at each time step.
The recurrent unit serves as a form of memory: the new hidden state is computed from the current input and the previous hidden state, e.g. h_t = tanh(W_x·x_t + W_h·h_{t-1} + b).
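To make the update concrete, here is a minimal sketch of a vanilla RNN forward pass in NumPy (the dimensions, variable names, and random weights are illustrative assumptions, not from StatQuest):

```python
import numpy as np

def rnn_forward(xs, W_x, W_h, b, h0):
    """Run a vanilla RNN over a sequence of input vectors.

    xs:  list of input vectors x_t, each of shape (input_dim,)
    W_x: input-to-hidden weights, shape (hidden_dim, input_dim)
    W_h: hidden-to-hidden (recurrent) weights, shape (hidden_dim, hidden_dim)
    b:   hidden bias, shape (hidden_dim,)
    h0:  initial hidden state, shape (hidden_dim,)
    """
    h = h0
    hs = []
    for x in xs:
        # The recurrent update: the new memory mixes the current input
        # with the previous hidden state.
        h = np.tanh(W_x @ x + W_h @ h + b)
        hs.append(h)
    return hs

# Tiny usage example with random weights (input_dim=2, hidden_dim=3).
rng = np.random.default_rng(0)
xs = [rng.normal(size=2) for _ in range(5)]  # 5 time steps
W_x = rng.normal(size=(3, 2))
W_h = rng.normal(size=(3, 3))
b = np.zeros(3)
hs = rnn_forward(xs, W_x, W_h, b, h0=np.zeros(3))
print(hs[-1])  # final hidden state summarizes the whole sequence
```

Note how the same cell covers several regimes: the last hidden state can feed a classifier (many-to-one), or each h_t can produce an output (synced many-to-many).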
RNN Example from StatQuest
Weights in the recurrent connections are updated with backpropagation through time: the network is unrolled across its time steps, and gradients flow backward through the unrolled copies.
Because of the chain rule, the gradient for earlier time steps is computed with increasingly many multiplications as the sequence gets longer: one factor involving the recurrent weights per time step. Repeatedly multiplying by factors smaller than 1 shrinks the gradient toward zero (vanishing gradients), while factors larger than 1 make it blow up (exploding gradients).
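A toy scalar example (my own illustration) of why that multiplication chain matters:

```python
# Toy scalar RNN: h_t = w * h_{t-1}, so dh_T/dh_0 = w ** T.
# One multiplication by the recurrent weight per time step is exactly
# the "increasingly many multiplications" described above.
for w in (0.5, 1.0, 2.0):
    for T in (1, 10, 50):
        print(f"w={w}, {T} steps: gradient factor = {w ** T:g}")
```

With w = 0.5 the factor collapses toward 0 after 50 steps, with w = 2.0 it explodes, and only w ≈ 1 stays stable.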