NLTK

NLTK stands for Natural Language Toolkit. Tokenization is simply splitting text into a list of tokens (words or sentences).

Word tokenization with Python's built-in functions:

words = text.split()

Word tokenization with NLTK:

from nltk.tokenize import word_tokenize
words = word_tokenize(text)

Sentence tokenization with NLTK:

from nltk.tokenize import sent_tokenize
sentences = sent_tokenize(text)

See the NLTK Documentation for details.
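A minimal end-to-end sketch of the difference between the two tokenizers (the sample text is illustrative; the punkt tokenizer models need to be downloaded once):

import nltk
from nltk.tokenize import word_tokenize, sent_tokenize

nltk.download('punkt')  # one-time download of the tokenizer models

text = "Dr. Smith arrived. He was late."
print(word_tokenize(text))  # ['Dr.', 'Smith', 'arrived', '.', 'He', 'was', 'late', '.']
print(sent_tokenize(text))  # ['Dr. Smith arrived.', 'He was late.']

Note that unlike text.split(), word_tokenize separates punctuation into its own tokens, and sent_tokenize knows that the period in "Dr." does not end a sentence.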

Defining the network and feedforward function

import torch
import torch.nn as nn
import torch.nn.functional as F

class LSTMTagger(nn.Module):

    def __init__(self, embedding_dim, hidden_dim, vocab_size, tagset_size):
        super(LSTMTagger, self).__init__()
        self.hidden_dim = hidden_dim
        # embedding layer turns word indices into dense vectors
        self.word_embeddings = nn.Embedding(vocab_size, embedding_dim)
        # the LSTM maps embedded word vectors to hidden states
        self.lstm = nn.LSTM(embedding_dim, hidden_dim)
        # linear layer maps hidden states to tag scores
        self.hidden2tag = nn.Linear(hidden_dim, tagset_size)
        self.hidden = self.init_hidden()

    def init_hidden(self):
        # (h0, c0): initial hidden and cell states, shape (num_layers, batch_size, hidden_dim)
        return (torch.zeros(1, 1, self.hidden_dim),
                torch.zeros(1, 1, self.hidden_dim))

    def forward(self, sentence):
        embeds = self.word_embeddings(sentence)
        lstm_out, self.hidden = self.lstm(embeds.view(len(sentence), 1, -1), self.hidden)
        tag_outputs = self.hidden2tag(lstm_out.view(len(sentence), -1))
        tag_scores = F.log_softmax(tag_outputs, dim=1)
        return tag_scores
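A quick usage sketch of the class above; the dimensions and the word_to_ix / tag_to_ix mappings below are illustrative assumptions, not from the original:

# hypothetical vocabulary and tag set
word_to_ix = {"the": 0, "dog": 1, "barked": 2}
tag_to_ix = {"DET": 0, "NN": 1, "V": 2}

model = LSTMTagger(embedding_dim=6, hidden_dim=6,
                   vocab_size=len(word_to_ix), tagset_size=len(tag_to_ix))

# convert a sentence to a tensor of word indices and run the forward pass
sentence = torch.tensor([word_to_ix[w] for w in ["the", "dog", "barked"]], dtype=torch.long)
tag_scores = model(sentence)
print(tag_scores.shape)  # (sentence length, number of tags) -> torch.Size([3, 3])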

Basic LSTM Network

The first layer of an LSTM network should always be an embedding layer, which takes the size of the vocabulary dictionary as its input dimension. Before we initialize the network we need to define that vocabulary, which is simply a dictionary of unique words where each word is assigned a numerical index; one way to build it is sketched below.
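A minimal sketch of building such a word-to-index dictionary (the training_data variable is an illustrative assumption):

# hypothetical training data: (sentence, tags) pairs
training_data = [
    ("the dog barked".split(), ["DET", "NN", "V"]),
    ("the cat slept".split(), ["DET", "NN", "V"]),
]

word_to_ix = {}
for sentence, tags in training_data:
    for word in sentence:
        if word not in word_to_ix:
            word_to_ix[word] = len(word_to_ix)  # assign the next free index

print(word_to_ix)  # {'the': 0, 'dog': 1, 'barked': 2, 'cat': 3, 'slept': 4}

The length of this dictionary, len(word_to_ix), is the vocab_size passed to the embedding layer.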

LSTM in PyTorch

To define an LSTM:

lstm = nn.LSTM(input_size=input_dim, hidden_size=hidden_dim, num_layers=n_layers)

To initialize the hidden and cell states (shape: (num_layers, batch_size, hidden_dim)):

h0 = torch.randn(1, 1, hidden_dim)
c0 = torch.randn(1, 1, hidden_dim)

In older versions of PyTorch we need to wrap everything in Variable, where the input is a tensor (since PyTorch 0.4, Variable has been merged into Tensor, so this step is no longer required):

inputs = Variable(inputs)
h0 = Variable(h0)
c0 = Variable(c0)

Get the outputs and the hidden state:

out, hidden = lstm(inputs, (h0, c0))
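Putting this together, a minimal runnable sketch with assumed dimensions (input_dim=4, hidden_dim=3, one layer, a sequence of 5 steps with batch size 1):

import torch
import torch.nn as nn

input_dim, hidden_dim, n_layers = 4, 3, 1
lstm = nn.LSTM(input_size=input_dim, hidden_size=hidden_dim, num_layers=n_layers)

# input shape: (sequence length, batch size, input_dim)
inputs = torch.randn(5, 1, input_dim)
h0 = torch.randn(n_layers, 1, hidden_dim)
c0 = torch.randn(n_layers, 1, hidden_dim)

out, (hn, cn) = lstm(inputs, (h0, c0))
print(out.shape)  # torch.Size([5, 1, 3]): one hidden state per time step
print(hn.shape)   # torch.Size([1, 1, 3]): final hidden state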

LSTM Cells

LSTM cells replace the hidden layers of a recurrent neural network, and they can be stacked so you can have multiple hidden layers, all of them being LSTM cells. A cell is made up of 4 gates; it takes the previous long-term memory, the previous short-term memory, and the current event as inputs, and produces the updated long-term and short-term memories as outputs. Learn Gate: it takes the short-term memory and the event, combines them with a tanh function, and then ignores part of the result by multiplying it by an ignore factor computed with a sigmoid; a sketch of this computation follows below.
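As an illustration of the learn gate only, here is a hedged sketch of the computation just described; the function and weight names are assumptions for illustration, not a PyTorch API:

import torch

def learn_gate(stm, event, W_n, b_n, W_i, b_i):
    # combine the short-term memory and the current event
    combined = torch.cat([stm, event], dim=1)
    N_t = torch.tanh(combined @ W_n + b_n)     # candidate new information (tanh)
    i_t = torch.sigmoid(combined @ W_i + b_i)  # ignore factor (sigmoid)
    return N_t * i_t                           # keep only the non-ignored part

Inside nn.LSTM these gate weights are learned parameters; the sketch just makes the tanh-then-ignore structure of the learn gate explicit.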