LSTM in PyTorch

To define an LSTM:

lstm = nn.LSTM(input_size=input_dim, hidden_size=hidden_dim, num_layers=n_layers)

To initialize the hidden state:

h0 = torch.randn(1, 1, hidden_dim)
c0 = torch.randn(1, 1, hidden_dim)

We will need to wrap everything in Variable; the input is a tensor:

inputs = Variable(inputs)
h0 = Variable(h0)
c0 = Variable(c0)

Get the outputs and the hidden state:

out, hidden = lstm(inputs,…
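Putting the pieces above together, here is a minimal runnable sketch (the dimensions are arbitrary example values). Note that since PyTorch 0.4 the Variable wrapper is no longer needed; plain tensors can be fed to the LSTM directly:

import torch
import torch.nn as nn

input_dim, hidden_dim, n_layers = 10, 20, 1
seq_len, batch_size = 5, 1

lstm = nn.LSTM(input_size=input_dim, hidden_size=hidden_dim, num_layers=n_layers)

# Shapes: inputs are (seq_len, batch, input_size);
# h0 and c0 are (num_layers, batch, hidden_size)
inputs = torch.randn(seq_len, batch_size, input_dim)
h0 = torch.randn(n_layers, batch_size, hidden_dim)
c0 = torch.randn(n_layers, batch_size, hidden_dim)

out, (hn, cn) = lstm(inputs, (h0, c0))
print(out.shape)  # torch.Size([5, 1, 20]) — one output per time step
print(hn.shape)   # torch.Size([1, 1, 20]) — final hidden state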

LSTM Cells

LSTM cells replace the hidden layers in a recurrent neural network, and they can be stacked, so you can have multiple hidden layers, all of them being LSTM cells. A cell is comprised of 4 gates, with 2 inputs and 2 outputs: Learn Gate: it takes the short-term memory and the event and combines them with a tanh function, and then ignores a…
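As a rough illustration of the learn gate only (the weights W_n, W_i and the dimensions below are made up for the example; in practice nn.LSTM manages all of this internally):

import torch

hidden_dim, event_dim = 4, 3
# Hypothetical weights and biases, just for the sketch
W_n = torch.randn(hidden_dim + event_dim, hidden_dim)
b_n = torch.zeros(hidden_dim)
W_i = torch.randn(hidden_dim + event_dim, hidden_dim)
b_i = torch.zeros(hidden_dim)

def learn_gate(stm, event):
    # Combine the short-term memory and the event, squash with tanh...
    combined = torch.cat([stm, event], dim=-1)
    N_t = torch.tanh(combined @ W_n + b_n)
    # ...then "ignore" part of it: a sigmoid factor decides how much to keep
    i_t = torch.sigmoid(combined @ W_i + b_i)
    return N_t * i_t

print(learn_gate(torch.randn(hidden_dim), torch.randn(event_dim)))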

Hyperparameters

A hyperparameter is a variable that we need to set to a value before we can train a neural network. There are no magic numbers; it all depends on the architecture, the data, the problem to solve, etc.

Optimizer hyperparameters: the learning rate is the most important hyperparameter of all; typical values are 0.1, 0.01, 0.001, 0.0001, 0.00001 and…
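As a small illustration of where the learning rate goes in PyTorch (the model and the value 0.01 are arbitrary examples):

import torch.nn as nn
import torch.optim as optim

model = nn.Linear(10, 2)  # any model; a single linear layer just for illustration

# The learning rate is set when the optimizer is created
optimizer = optim.SGD(model.parameters(), lr=0.01)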

Dropout

Dropout randomly turns off nodes in each layer (with some specified probability) on every pass through the network, during both the feedforward and backpropagation steps. A node that is disabled will not contribute to the prediction and will not get its weights updated during backpropagation. This helps the model generalize better and increases accuracy on…
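A quick sketch of dropout in PyTorch; one detail worth noting is that nn.Dropout also rescales the surviving values by 1/(1-p) during training, so the expected activations stay the same:

import torch
import torch.nn as nn

# p=0.5 means each node is zeroed with probability 0.5 during training
drop = nn.Dropout(p=0.5)

x = torch.ones(1, 10)
drop.train()   # dropout active: about half the values are zeroed, the rest scaled by 2
print(drop(x))
drop.eval()    # dropout disabled at evaluation time: input passes through unchanged
print(drop(x))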

Kaggle

To download datasets from Kaggle you can use kaggle-cli. You can install it with pip:

pip install kaggle-cli

or upgrade it with:

pip install kaggle-cli --upgrade

The only problem with it is that it will download all of the dataset files, which can be huge (more than 20 GB). Another way to do…
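As an alternative sketch, the official kaggle package (a different tool from kaggle-cli) can download a single file instead of the whole dataset; the dataset slug and file name below are placeholders, not real values:

from kaggle.api.kaggle_api_extended import KaggleApi

api = KaggleApi()
api.authenticate()  # reads credentials from ~/.kaggle/kaggle.json

# Download one file instead of the full (potentially huge) dataset;
# 'owner/dataset-name' and 'train.csv' are hypothetical examples
api.dataset_download_file('owner/dataset-name', 'train.csv', path='data/')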

Basic recipe for ML

Does the model have high bias? (training data performance) Try a bigger network, train longer (more epochs), or try a different NN architecture; keep doing this until the model fits the training data well. Does the model have high variance? (dev set performance) Add more data, use data augmentation, try regularization, or a different NN architecture. For high…

Bias and Variance

When looking at the training set error and comparing it to the cross-validation error, we could have the following situations:

Train set error: 1% (the model is doing very well)
Cross val. error: 11%

There is a big gap in error between the training set and the cross-validation set; this is an example of high variance, and…
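To make the basic recipe above concrete, here is a toy sketch that diagnoses the 1% / 11% example (the thresholds are arbitrary illustrations, not fixed rules):

def diagnose(train_error, val_error, acceptable_error=0.02, acceptable_gap=0.02):
    # High bias: the model does not even fit the training data well
    if train_error > acceptable_error:
        return "high bias: try a bigger network, train longer, or change the architecture"
    # High variance: big gap between training and dev/cross-val performance
    if val_error - train_error > acceptable_gap:
        return "high variance: add more data, use augmentation, or try regularization"
    return "looks fine"

print(diagnose(0.01, 0.11))  # -> high variance (1% train vs 11% cross val.)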