### Scikit Learn

Logistic Regression is used for classification problems. It outputs probabilities: if the probability is > 0.5 the sample is labeled 1, otherwise it is labeled 0. It produces a linear decision boundary.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

logreg = LogisticRegression()
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
logreg.fit(X_train, y_train)
y_pred = logreg.predict(X_test)
```

ROC Curve stands for Receiver Operating Characteristic curve…
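The ROC curve mentioned above can be computed with scikit-learn's `roc_curve`, which needs probability scores rather than hard 0/1 predictions. A minimal sketch, assuming a synthetic dataset (the real data and split above are not shown in full):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic binary-classification data, assumed for illustration
X, y = make_classification(n_samples=500, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

logreg = LogisticRegression().fit(X_train, y_train)
# roc_curve expects probability scores for the positive class
y_prob = logreg.predict_proba(X_test)[:, 1]
fpr, tpr, thresholds = roc_curve(y_test, y_prob)
auc = roc_auc_score(y_test, y_prob)
print(f"AUC: {auc:.3f}")
```

Plotting `fpr` against `tpr` gives the ROC curve; the area under it (AUC) summarizes the classifier's quality across all thresholds.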

### Fast.AI

Update Feb-2019: Deep Learning V3 Kernels

- Lesson 1 Pets
- Lesson 2 Download
- Lesson 2 SGD
- Lesson 3 Camvid-tiramisu
- Lesson 3 Camvid
- Lesson 3 Head-Pose
- Lesson 3 Planet
- Lesson 3 Tabular
- Lesson 4 Collab
- Lesson 4 Tabular
- Lesson 5 SGD-MNIST
- Lesson 6 Pets-more
- Rossmann data clean
- Lesson 6 Rossmann
- Lesson 7 Human-numbers
- [Lesson 7…

### Reinforcement Learning

Reinforcement Learning optimizes an agent for sparse, time-delayed labels called rewards in an environment. Markov Decision Processes, or MDPs for short, are a mathematical framework for modeling decisions using states, actions, and rewards. Q-Learning is a strategy that finds the optimal action-selection policy for any MDP.
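The Q-Learning idea can be sketched as a tabular update rule, `Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))`. The toy corridor environment and hyperparameters below are assumptions for illustration, not from the notes:

```python
import random

# Toy 1-D corridor MDP (assumed): states 0..4, actions 0=left / 1=right,
# reward only on reaching the rightmost state -- a sparse, delayed reward.
N_STATES, ACTIONS = 5, (0, 1)
alpha, gamma, epsilon = 0.1, 0.9, 0.1

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    """Move left or right; reward 1.0 only when the last state is reached."""
    nxt = max(0, min(N_STATES - 1, state + (1 if action == 1 else -1)))
    return nxt, (1.0 if nxt == N_STATES - 1 else 0.0)

random.seed(0)
for episode in range(300):
    s = 0
    for _ in range(100):  # cap episode length
        # epsilon-greedy action selection (ties broken randomly)
        if random.random() < epsilon:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: (Q[(s, act)], random.random()))
        s2, r = step(s, a)
        # Q-learning update toward r + gamma * max_a' Q(s', a')
        best_next = max(Q[(s2, act)] for act in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s2
        if s == N_STATES - 1:
            break
```

After training, `Q[(s, 1)]` (move right, toward the reward) dominates `Q[(s, 0)]` in the interior states, which is exactly the optimal policy for this MDP.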

### This is a bad Omega constraint matrix

This is a bad Omega constraint matrix. It is definitely not correct: as we can see, the landmark positions are almost all zeros. I used this to debug my code; it is a singular matrix (it does not have an inverse). [[ 6. 1. -1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.…
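One quick way to confirm a constraint matrix is singular, as this debugging note describes, is to check its rank or determinant with NumPy. The small matrix below is an assumed stand-in (the real Omega is truncated above); its second row is a multiple of the first, so it has no inverse:

```python
import numpy as np

# Toy singular matrix for illustration: row 2 = 2 * row 1, row 3 all zeros
omega = np.array([[6.0, 1.0, -1.0],
                  [12.0, 2.0, -2.0],
                  [0.0, 0.0, 0.0]])

# A square matrix is invertible only if it has full rank
# (equivalently, a nonzero determinant)
rank = np.linalg.matrix_rank(omega)
det = np.linalg.det(omega)
print(rank, det)  # rank < 3 and det ~= 0, so np.linalg.inv(omega) would fail
```

A rank lower than the matrix dimension is often easier to spot than a tiny determinant, and it also tells you how many independent constraints are missing.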

### PyTorch

We can define our own PyTorch modules; to do so we need to inherit from `nn.Module`. To have a fully functional PyTorch layer we create a constructor and call the parent class constructor with `super().__init__()`. Then all we need to do is define the `forward` function and return the result of a forward pass.…
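The steps above can be sketched as a minimal custom module; the layer sizes and two-layer structure are arbitrary choices for illustration:

```python
import torch
import torch.nn as nn

class TwoLayerNet(nn.Module):
    def __init__(self, in_features, hidden, out_features):
        super().__init__()  # call the parent class constructor
        self.fc1 = nn.Linear(in_features, hidden)
        self.fc2 = nn.Linear(hidden, out_features)

    def forward(self, x):
        # forward() defines the forward pass; PyTorch builds the
        # autograd graph from these operations automatically
        x = torch.relu(self.fc1(x))
        return self.fc2(x)

model = TwoLayerNet(4, 8, 2)
out = model(torch.randn(3, 4))  # call the module itself, not forward()
print(out.shape)                # torch.Size([3, 2])
```

Note that we invoke the module as `model(x)` rather than `model.forward(x)`, so that PyTorch's hooks run.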

### Momentum

In gradient descent, momentum uses a constant \beta between 0 and 1 to calculate the next step. It weights previous steps so that the most recent step matters the most, and the weight of each earlier step decreases geometrically; this decay is controlled by the constant \beta…
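This exponentially weighted update can be sketched in plain Python; the quadratic objective and the hyperparameter values are assumptions for illustration:

```python
# Gradient descent with momentum on f(x) = x^2 (gradient 2x).
# beta weights a running velocity: the previous step matters a lot,
# and older steps decay geometrically by repeated factors of beta.
beta = 0.9       # momentum constant, between 0 and 1
lr = 0.1         # learning rate
x, v = 5.0, 0.0  # start far from the minimum at x = 0

for _ in range(100):
    grad = 2 * x
    v = beta * v + (1 - beta) * grad  # exponentially weighted average of gradients
    x = x - lr * v

print(x)  # close to the minimum at 0
```

Unrolling the recurrence shows the weighting explicitly: `v_t = (1 - beta) * (grad_t + beta * grad_{t-1} + beta^2 * grad_{t-2} + ...)`.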

### Key ML Terminology

Feature: features are the input variables we feed into a network. A feature can be as simple as a single number or as complex as an image (which in reality is a vector of numbers, where each pixel is a feature). Label: the thing we are predicting; it is normally referred to as y…

### Learning rates

Cosine Annealing: uses cos/2 (half of the cosine function) to decrease the learning rate as training progresses. We can use Cosine Annealing on Stochastic Gradient Descent with warm restarts, meaning that when a cycle ends the learning rate jumps back to the highest learning rate (restarting at the top of the cosine function) and…
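The schedule can be sketched by hand with the usual SGDR-style formula, eta_t = eta_min + (eta_max - eta_min) * (1 + cos(pi * t / T)) / 2; the rate range and cycle length below are illustrative assumptions:

```python
import math

# Cosine annealing with warm restarts, sketched by hand
eta_max, eta_min = 0.1, 0.001
cycle_len = 10  # steps per cycle; the lr restarts to eta_max at each cycle start

def cosine_annealing_lr(step):
    t = step % cycle_len  # position inside the current cycle
    # half of the cosine: decays from eta_max (t = 0) down toward eta_min
    return eta_min + 0.5 * (eta_max - eta_min) * (1 + math.cos(math.pi * t / cycle_len))

lrs = [cosine_annealing_lr(s) for s in range(25)]
print(lrs[0], lrs[9], lrs[10])  # top of the cosine, near the bottom, warm restart
```

PyTorch ships the same schedule as `torch.optim.lr_scheduler.CosineAnnealingWarmRestarts`, so in practice you would usually use that rather than hand-rolling it.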

### 100DaysOfMLCode Index

- Attention mechanisms
- Batch size
- CNNs
- Computer Vision
- Conda
- Data Augmentation
- Defining a network structure
- Downloading Datasets
- Dropout
- Embeddings
- FastAI
- Filtered Images
- Facia-Keypoints-Detector notes
- GPU States
- High bias & high variance
- Hyperparameters
- Image Captioning Project Notes
- Intro to Pandas Lab
- Jobs in Computer Vision
- Layer Shapes
- Learning Rates
- Localization
- LSTM cells
- Momentum
- Machine…

### Weighted Loss Functions

For image classification and localization we need to use two loss functions on the same network! To calculate the predicted class we would use categorical cross-entropy, but to find the bounding box (which is a regression problem) we need to use something like MSE, L1 loss, or smooth L1 loss.…
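A combined, weighted loss of this kind might be sketched in PyTorch as follows; the weights, tensor shapes, and fake network outputs are assumptions for illustration:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Two losses on one network: cross-entropy for the class scores,
# smooth L1 for the bounding-box regression. The 1.0 / 0.5 weights
# are illustrative assumptions, typically tuned per task.
class_criterion = nn.CrossEntropyLoss()
box_criterion = nn.SmoothL1Loss()
w_class, w_box = 1.0, 0.5

# Fake network outputs for a batch of 4 images (assumed shapes)
class_logits = torch.randn(4, 10, requires_grad=True)  # 10 classes
box_preds = torch.randn(4, 4, requires_grad=True)      # (x, y, w, h) per image
class_targets = torch.randint(0, 10, (4,))
box_targets = torch.randn(4, 4)

loss = (w_class * class_criterion(class_logits, class_targets)
        + w_box * box_criterion(box_preds, box_targets))
loss.backward()  # one backward pass trains both heads through the shared network
```

Because the two terms are simply added, a single `backward()` call propagates gradients from both objectives through the shared layers, which is why the relative weights matter.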