Gradient descent is an optimisation method for finding a minimum of a function. In deep learning, it is commonly used to update the weights of a neural network, using the gradients of the loss that backpropagation computes.
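To make the update rule concrete, here is a minimal sketch of plain gradient descent on a one-dimensional toy function, f(w) = (w - 3)^2, whose minimum is at w = 3. The function, the names `gradient_descent` and `grad_f`, the learning rate and the number of steps are all assumptions chosen for illustration; none of them come from a particular framework.

```python
def gradient_descent(grad, w0, lr=0.1, steps=100):
    """Repeatedly step against the gradient: w <- w - lr * grad(w)."""
    w = w0
    for _ in range(steps):
        w = w - lr * grad(w)
    return w


def grad_f(w):
    # Derivative of the toy function f(w) = (w - 3)^2 (illustrative only).
    return 2.0 * (w - 3.0)


# Start from w = 0 and move towards the minimiser at w = 3.
w_min = gradient_descent(grad_f, w0=0.0)
print(w_min)
```

With these settings the distance to the minimiser shrinks by a factor of 0.8 at every step, so `w_min` ends up very close to 3. In a neural network, `w` would be the full set of weights and `grad` would be supplied by backpropagation.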
In this post, I will summarise the common gradient descent optimisation algorithms used in popular deep learning frameworks (e.g. TensorFlow, Keras, PyTorch, Caffe). The aim is to make them easy to read and digest by using consistent nomenclature, since few such summaries exist, and to serve as a cheat sheet if you want to implement them from scratch.