It can process not only single data points (such as images) but also entire sequences of data (such as speech or video).
For example, LSTM is applicable to tasks such as unsegmented, connected handwriting recognition or speech recognition.
Unlike standard feedforward neural networks, LSTM has feedback connections, which in principle make it a “general-purpose computer” (that is, it can compute anything that a Turing machine can).
A common LSTM unit is composed of a memory cell, an input gate, an output gate and a forget gate.
At each time step, the previous cell state is multiplied elementwise by the forget gate, which discards part of it, and new information scaled by the input gate is then added to the result.
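The cell-state update described above can be sketched in a few lines of plain Python. This is a minimal illustration rather than a reference implementation: the function names, the `params` dictionary and its keys ("f", "i", "g", "o" for the forget gate, input gate, candidate values and output gate) are assumptions made for this example, and the layout follows the commonly used formulation without peephole connections.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, b, v):
    """Return W v + b, where W is a list of rows and b, v are lists."""
    return [sum(w * a for w, a in zip(row, v)) + bias
            for row, bias in zip(W, b)]


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step on plain Python lists.

    params maps each of "f", "i", "g", "o" to a (W, b) pair that acts on
    the concatenation of x and h_prev (hypothetical layout for this sketch).
    """
    v = x + h_prev  # list concatenation: [input, previous hidden state]
    f = [sigmoid(z) for z in affine(*params["f"], v)]    # forget gate
    i = [sigmoid(z) for z in affine(*params["i"], v)]    # input gate
    g = [math.tanh(z) for z in affine(*params["g"], v)]  # candidate values
    o = [sigmoid(z) for z in affine(*params["o"], v)]    # output gate
    # Old state scaled by the forget gate, plus new information scaled by
    # the input gate.
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c
```

With all weights and biases set to zero, every gate evaluates to 0.5 and the candidate values to 0, so a single step simply halves the previous cell state. This makes the role of the forget gate easy to check by hand.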
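LSTMs were designed to mitigate the vanishing gradient problem that can be encountered when training traditional RNNs. Exploding gradients can still occur in LSTMs and are usually handled separately, for example by gradient clipping.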
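The effect on vanishing gradients can be illustrated with a deliberately simplified scalar comparison. The sketch below assumes a one-unit tanh RNN, h_t = tanh(w * h_{t-1}), and considers only the direct cell-state path of an LSTM with the gate activations treated as constants. The function names and the values 0.9 and 0.99 are hypothetical choices for this example.

```python
import math


def tanh_rnn_gradient(w, h0, steps):
    """d h_T / d h_0 for the scalar RNN h_t = tanh(w * h_{t-1})."""
    h, grad = h0, 1.0
    for _ in range(steps):
        h = math.tanh(w * h)
        grad *= w * (1.0 - h * h)  # chain rule through one tanh step
    return grad


def direct_cell_gradient(forget_activations):
    """d c_T / d c_0 along the direct cell-state path.

    With gates treated as constants, c_t = f_t * c_{t-1} + (other terms),
    so the gradient is the product of the forget-gate activations.
    """
    return math.prod(forget_activations)


rnn_grad = tanh_rnn_gradient(w=0.9, h0=0.5, steps=100)
lstm_grad = direct_cell_gradient([0.99] * 100)
print(rnn_grad, lstm_grad)
```

Under these assumptions, the RNN gradient is bounded in magnitude by 0.9 raised to the 100th power, while the product of forget-gate activations close to 1 stays several orders of magnitude larger. The comparison ignores the indirect paths through the hidden state and the gates, so it indicates the mechanism rather than the behavior of a trained network.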