This equation represents the output of an LSTM block. It need not be the output of the whole network; it can instead serve as input to other parts of a neural network, e.g. linear or recurrent layers.
| Symbol | Meaning |
| --- | --- |
| \(g^\text{output}\) | This symbol represents the state of the output gate of the LSTM. |
| \(\mathbf{y}\) | This symbol represents the output activation vector of a neural network. |
| \(c\) | This symbol represents the memory cell of an LSTM. |
| \(n\) | This symbol represents any given whole number, \( n \in \htmlId{tooltip-setOfWholeNumbers}{\mathbb{W}}\). |
Remember that the memory cell, \(\htmlId{tooltip-memoryCellLSTM}{c}(\htmlId{tooltip-wholeNumber}{n})\) is a vector representing the network's memory. It holds the current state of the network:
\[\htmlId{tooltip-memoryCellLSTM}{c}(\htmlId{tooltip-wholeNumber}{n}+1)=\htmlId{tooltip-forgetGateLSTM}{g^\text{forget}}(\htmlId{tooltip-wholeNumber}{n}+1)\cdot \htmlId{tooltip-memoryCellLSTM}{c}(\htmlId{tooltip-wholeNumber}{n}) + \htmlId{tooltip-inputGateLSTM}{g^\text{input}} (\htmlId{tooltip-wholeNumber}{n}+1) \cdot \htmlId{tooltip-inputNeuronLSTM}{u}(\htmlId{tooltip-wholeNumber}{n}+1)\]
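This update can be sketched numerically. The sketch below uses hypothetical gate and input values for a 2-unit cell; the names `g_forget`, `g_input`, and `u` mirror the symbols in the equation and are illustrative, not taken from any particular library.

```python
import numpy as np

# Hypothetical values for a 2-unit LSTM memory cell
c_n = np.array([0.5, -0.2])       # current memory c(n)
g_forget = np.array([0.9, 0.1])   # forget gate g^forget(n+1)
g_input = np.array([0.3, 0.8])    # input gate g^input(n+1)
u = np.array([1.0, -0.5])         # transformed input u(n+1)

# Element-wise update: keep a gated part of the old memory,
# then add a gated part of the new input
c_next = g_forget * c_n + g_input * u
```

Note that both products are element-wise, so each unit of the cell forgets and accumulates independently.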
Further, \(\htmlId{tooltip-outputGateLSTM}{g^\text{output}}(\htmlId{tooltip-wholeNumber}{n})\) is computed from the transformed external signal, e.g. from previous layers:
\[\htmlId{tooltip-outputGateLSTM}{g^\text{output}}(\htmlId{tooltip-wholeNumber}{n}+1) = \htmlId{tooltip-sigmoid}{\sigma}(\htmlId{tooltip-weightMatrix}{\mathbf{W}}^{\htmlId{tooltip-outputGateLSTM}{g^\text{output}}}[1;x^{\htmlId{tooltip-outputGateLSTM}{g^\text{output}}}])\]
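A minimal sketch of this gate computation, assuming a hypothetical weight matrix \(\mathbf{W}\) whose first column multiplies the constant 1 (i.e. acts as the bias term in \([1; x]\)):

```python
import numpy as np

def sigmoid(z):
    # Logistic function, squashes each entry into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical weight matrix for 2 gate units over bias + 2 inputs
W = np.array([[0.1, 0.5, -0.3],
              [0.0, 0.2,  0.4]])
x = np.array([1.0, -1.0])  # external signal x

# Prepend 1 so the first column of W serves as the bias: [1; x]
g_output = sigmoid(W @ np.concatenate(([1.0], x)))
```

Because of the sigmoid, every entry of `g_output` lies strictly between 0 and 1, so the gate scales (rather than flips or amplifies) the memory it multiplies.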
See *Update of a memory cell in an LSTM* and *Output gate of an LSTM* for more details.
Intuitively, we want the output of an LSTM to be influenced both by its state (memory) and by the newly acquired signal. We achieve this in the simplest way possible: by multiplying the two vectors element-wise.
Let the current memory cell be
\[\htmlId{tooltip-memoryCellLSTM}{c}(\htmlId{tooltip-wholeNumber}{n}) = \begin{bmatrix}0.7 \\0.3\end{bmatrix}\]
and the value of the output gate
\[\htmlId{tooltip-outputGateLSTM}{g^\text{output}}(\htmlId{tooltip-wholeNumber}{n}) = \begin{bmatrix}0.4 \\0.6\end{bmatrix}\]
Then, the output of the whole LSTM block is:
\[\htmlId{tooltip-outputGateLSTM}{g^\text{output}}(\htmlId{tooltip-wholeNumber}{n}) \cdot \htmlId{tooltip-memoryCellLSTM}{c}(\htmlId{tooltip-wholeNumber}{n})\]
Substituting these values, we obtain:
\[\begin{bmatrix}0.7 \\0.3\end{bmatrix} \cdot\begin{bmatrix}0.4 \\0.6\end{bmatrix} = \begin{bmatrix}0.28 \\0.18\end{bmatrix}\]
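The same worked example, reproduced in code as an element-wise (Hadamard) product of the two vectors from above:

```python
import numpy as np

c = np.array([0.7, 0.3])          # memory cell c(n) from the example
g_output = np.array([0.4, 0.6])   # output gate g^output(n)

# Element-wise product yields the LSTM block output
y = g_output * c
# y is [0.28, 0.18], matching the result above
```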