Probability Basic Concepts

Independent Events: The outcome of one event does not affect the probability of the next event. For example, tossing a coin does not affect the probability of the next flip. Dependent Events: Two events are dependent when the probability of one influences the likelihood of the other. Joint Probability: the probability…
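For independent events, the joint probability is the product of the individual probabilities, P(A and B) = P(A) * P(B). A minimal Python sketch of this rule (the coin simulation is illustrative, not from the original post):

import random

p_heads = 0.5
# Product rule for independent events: P(A and B) = P(A) * P(B)
p_two_heads = p_heads * p_heads  # 0.25

# Sanity check: simulate two independent coin tosses many times
trials = 100_000
hits = sum(
    random.random() < p_heads and random.random() < p_heads
    for _ in range(trials)
)
print(p_two_heads, hits / trials)  # both should be close to 0.25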

Motion in Computer Vision

Motion can be tracked with a 2D motion vector. A vector has a direction and a magnitude, which determine the direction and amount of movement between one frame and the next. First we have to define special points to track, for example intersections or corners; once we localize…
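As a sketch of this idea, assuming OpenCV and two consecutive grayscale frames (the file names here are hypothetical), corner points can be localized and then tracked into the next frame, giving one 2D motion vector per point:

import cv2

# Two consecutive frames (hypothetical file names), loaded as grayscale
prev_gray = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)
gray = cv2.imread("frame2.png", cv2.IMREAD_GRAYSCALE)

# Localize special points to track, e.g. corners (Shi-Tomasi detector)
p0 = cv2.goodFeaturesToTrack(prev_gray, maxCorners=100,
                             qualityLevel=0.3, minDistance=7)

# Track those points into the next frame (Lucas-Kanade optical flow)
p1, status, err = cv2.calcOpticalFlowPyrLK(prev_gray, gray, p0, None)

# Each successfully tracked point yields a 2D motion vector
for new, old, ok in zip(p1, p0, status):
    if ok[0] == 1:
        dx, dy = (new - old).ravel()  # direction and magnitude of movement
        print(f"motion vector: ({dx:.2f}, {dy:.2f})")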

Image Captioning Project

In this project I will train a network on the COCO (Common Objects in Context) dataset. This dataset contains images and a set of 5 different captions per image. I will train a CNN-RNN model by feeding it the images and captions, so the network learns to generate captions given an image. Once trained…
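A rough PyTorch sketch of the CNN-RNN pattern described above (this is an assumed outline, not the project's actual code; the backbone choice and layer sizes are illustrative):

import torch
import torch.nn as nn
import torchvision.models as models

class EncoderCNN(nn.Module):
    """CNN encoder: a ResNet whose classification head is replaced
    by a linear layer producing an image embedding."""
    def __init__(self, embed_size):
        super().__init__()
        resnet = models.resnet50(weights=None)  # pretrained weights omitted here
        self.resnet = nn.Sequential(*list(resnet.children())[:-1])
        self.embed = nn.Linear(resnet.fc.in_features, embed_size)

    def forward(self, images):
        features = self.resnet(images).flatten(1)
        return self.embed(features)

class DecoderRNN(nn.Module):
    """RNN decoder: consumes the image feature plus embedded caption tokens
    and predicts the next word at each step (teacher forcing)."""
    def __init__(self, embed_size, hidden_size, vocab_size):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_size)
        self.lstm = nn.LSTM(embed_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, vocab_size)

    def forward(self, features, captions):
        embeddings = self.embed(captions[:, :-1])
        inputs = torch.cat((features.unsqueeze(1), embeddings), dim=1)
        hiddens, _ = self.lstm(inputs)
        return self.fc(hiddens)  # word scores at every time step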

NLTK

NLTK stands for Natural Language Toolkit. Tokenization is just splitting text into a list of words (or sentences).

Word tokenization with Python built-in functions:
words = text.split()

Word tokenization with NLTK:
from nltk.tokenize import word_tokenize
words = word_tokenize(text)

Sentence tokenization with NLTK:
from nltk.tokenize import sent_tokenize
sentences = sent_tokenize(text)

NLTK Documentation
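A quick self-contained usage example (the sample text is made up; the punkt tokenizer models need a one-time download):

import nltk
from nltk.tokenize import word_tokenize, sent_tokenize

nltk.download("punkt")  # one-time download of the tokenizer models

text = "NLTK is a toolkit. It makes tokenization easy!"
print(word_tokenize(text))  # ['NLTK', 'is', 'a', 'toolkit', '.', 'It', ...]
print(sent_tokenize(text))  # ['NLTK is a toolkit.', 'It makes tokenization easy!']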

Embeddings

An embedding is a mapping from discrete objects, such as words, to vectors of real numbers. For example, a 300-dimensional embedding for English words could include:

blue: (0.01359, 0.00075997, 0.24608, …, -0.2524, 1.0048, 0.06259)
blues: (0.01396, 0.11887, -0.48963, …, 0.033483, -0.10007, 0.1158)
orange: (-0.24776, -0.12359, 0.20986, …, 0.079717, 0.23865, -0.014213)
oranges: (-0.35609, 0.21854, 0.080944, …,…
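As a minimal sketch of how such a lookup table works in PyTorch (the vocabulary size, dimensions, and word ids below are assumptions, not taken from any real pretrained model):

import torch
import torch.nn as nn

# A lookup table mapping 10,000 discrete word ids to 300-d real vectors
embedding = nn.Embedding(num_embeddings=10_000, embedding_dim=300)

word_ids = torch.tensor([42, 1337])  # hypothetical ids for "blue" and "blues"
vectors = embedding(word_ids)        # shape: (2, 300)

# After training, related words end up close together, e.g. by cosine similarity
sim = torch.cosine_similarity(vectors[0], vectors[1], dim=0)
print(vectors.shape, sim.item())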

Defining the network and feedforward function

import torch
import torch.nn as nn
import torch.nn.functional as F

class LSTMTagger(nn.Module):

    def __init__(self, embedding_dim, hidden_dim, vocab_size, tagset_size):
        super(LSTMTagger, self).__init__()
        self.hidden_dim = hidden_dim
        # Embedding layer: maps word indices to dense vectors
        self.word_embeddings = nn.Embedding(vocab_size, embedding_dim)
        # LSTM: takes word embeddings, outputs hidden states
        self.lstm = nn.LSTM(embedding_dim, hidden_dim)
        # Linear layer: maps each hidden state to tag scores
        self.hidden2tag = nn.Linear(hidden_dim, tagset_size)
        self.hidden = self.init_hidden()

    def init_hidden(self):
        # Fresh (h0, c0) state for a single-layer LSTM with batch size 1
        return (torch.zeros(1, 1, self.hidden_dim),
                torch.zeros(1, 1, self.hidden_dim))

    def forward(self, sentence):
        embeds = self.word_embeddings(sentence)
        lstm_out, self.hidden = self.lstm(
            embeds.view(len(sentence), 1, -1), self.hidden)
        tag_outputs = self.hidden2tag(lstm_out.view(len(sentence), -1))
        # The excerpt is truncated here; log_softmax over the tag dimension
        # is the standard completion for this kind of tagger
        tag_scores = F.log_softmax(tag_outputs, dim=1)
        return tag_scores
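A usage sketch for the class above (the vocabulary, tag set, and hyperparameters here are illustrative, not from the original post):

# Illustrative mappings and sizes
word_to_ix = {"the": 0, "dog": 1, "barks": 2}
tag_to_ix = {"DET": 0, "NN": 1, "V": 2}

model = LSTMTagger(embedding_dim=6, hidden_dim=6,
                   vocab_size=len(word_to_ix), tagset_size=len(tag_to_ix))

sentence = torch.tensor([word_to_ix[w] for w in ["the", "dog", "barks"]],
                        dtype=torch.long)
tag_scores = model(sentence)
print(tag_scores.shape)  # (3, 3): one row of log-probabilities per word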

Basic LSTM Network

The first layer of an LSTM network should always be an embedding layer, which takes the vocabulary size as its input. Before we initialize the network, we need to define the vocabulary: simply a dictionary of unique words, where each word has a numerical index. To do so we can use the…
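A minimal sketch of building such a word-to-index dictionary (the sample corpus is made up):

# Hypothetical tokenized corpus
sentences = [["the", "cat", "sleeps"], ["the", "dog", "barks"]]

word2idx = {}
for sentence in sentences:
    for word in sentence:
        if word not in word2idx:            # keep only unique words
            word2idx[word] = len(word2idx)  # assign the next free index

print(word2idx)  # {'the': 0, 'cat': 1, 'sleeps': 2, 'dog': 3, 'barks': 4}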