Module 1: Fundamentals of Machine Learning
Machine learning brings together statistics, computer science, and other fields to build models that learn from data.
The two main types are supervised learning and unsupervised learning.
Core components of a learning algorithm: Representation -> Evaluation -> Optimization
- scikit-learn
- SciPy
- NumPy
- Pandas
- Matplotlib
It is important to examine your dataset before using it for a machine learning task:
- To understand how much missing data there is in the dataset
- To get an idea of whether the features need further cleaning
- To find out whether the problem even requires machine learning
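A quick first pass with pandas covers the first two points; this sketch uses a small made-up DataFrame purely for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical toy dataset, for illustration only
df = pd.DataFrame({
    "age": [25, 32, np.nan, 45],
    "income": [50000, np.nan, np.nan, 72000],
    "label": [0, 1, 0, 1],
})

# Count missing values per column before any modeling
missing_per_column = df.isnull().sum()
print(missing_per_column)
```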
- K-Nearest Neighbors Classification
- A low value of “k” (close to 1) is more likely to overfit the training data and lead to worse accuracy on the test data, compared to higher values of “k”.
- Setting “k” to the number of points in the training set will result in a classifier that always predicts the majority class.
- The k-nearest neighbors classification algorithm has to memorize all of the training examples to make a prediction.
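The points above can be seen in a short experiment on synthetic data (not from the course): with k=1 the classifier memorizes the training set, so training accuracy is perfect while test accuracy typically suffers; a larger k smooths the decision boundary.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic data for illustration
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# k=1 memorizes the training set: perfect training accuracy
knn_1 = KNeighborsClassifier(n_neighbors=1).fit(X_train, y_train)
# A larger k averages over more neighbors and smooths the boundary
knn_15 = KNeighborsClassifier(n_neighbors=15).fit(X_train, y_train)

print("k=1  train:", knn_1.score(X_train, y_train), "test:", knn_1.score(X_test, y_test))
print("k=15 train:", knn_15.score(X_train, y_train), "test:", knn_15.score(X_test, y_test))
```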
Module 2: Supervised Machine Learning
- Linear regression
- Logistic regression
- Support Vector Machines
- Multi-class classification
- Kernelized Support Vector Machine
- Cross validation
- Decision tree (requires less preprocessing of data; easy to interpret and visualize)
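Cross validation, from the list above, can be sketched in a few lines; this example uses logistic regression on the built-in iris dataset purely as an illustration:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

# 5-fold cross validation: each fold is held out once for evaluation,
# giving five accuracy estimates instead of a single train/test split
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print("fold scores:", scores)
print("mean accuracy:", scores.mean())
```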
Module 3: Model Evaluation and Selection
- Represent/Train/Evaluate/Refine cycle
- Preamble
- Dummy Classifier (strategy: most_frequent, stratified, uniform, constant)
- Dummy Regressor (strategy: mean, median, quantile, constant)
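A dummy model gives a sanity-check baseline: any real classifier should beat it. A minimal sketch with imbalanced toy labels (illustrative only):

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Imbalanced toy labels: 90% class 0
X = np.zeros((100, 1))
y = np.array([0] * 90 + [1] * 10)

# most_frequent always predicts the majority class,
# so accuracy equals the majority-class fraction (0.9 here)
dummy = DummyClassifier(strategy="most_frequent").fit(X, y)
print(dummy.score(X, y))
```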
- Confusion matrices & basic evaluation metrics
Recall is also known as:
- True Positive Rate (TPR)
- Sensitivity
- Probability of detection
Specificity is the True Negative Rate (TNR), which equals 1 - FPR; the FPR itself is 1 - specificity.
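These quantities fall straight out of the confusion matrix; a small sketch on toy labels (illustrative only):

```python
from sklearn.metrics import confusion_matrix, recall_score

# Toy ground truth and predictions
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 1, 1, 0, 1, 1, 1]

# ravel() flattens the 2x2 matrix to (TN, FP, FN, TP)
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

# Recall / TPR / sensitivity: TP / (TP + FN)
recall = tp / (tp + fn)
# False positive rate: FP / (FP + TN); specificity (TNR) is 1 - FPR
fpr = fp / (fp + tn)

print("recall:", recall, "fpr:", fpr)
print("matches recall_score:", recall == recall_score(y_true, y_pred))
```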
- Classifier Decision Functions
- Precision-recall and ROC curves
- Multi-class evaluation, Macro average vs Micro average
- Regression evaluation (r2_score is usually sufficient)
- Model selection: Optimizing classifiers for different evaluation metrics
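Optimizing for a metric other than accuracy can be done by passing `scoring` to a grid search; a minimal sketch on synthetic data, tuning an SVM's C for recall:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic binary classification data, for illustration
X, y = make_classification(n_samples=200, random_state=0)

# scoring="recall" makes the search pick the C that maximizes recall,
# not the default accuracy
grid = GridSearchCV(SVC(), param_grid={"C": [0.1, 1, 10]},
                    scoring="recall", cv=3)
grid.fit(X, y)
print("best params:", grid.best_params_)
print("best recall:", grid.best_score_)
```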
Module 4: Supervised Machine Learning - Part 2
- Naive Bayes Classifier ("naive" means each feature of an instance is assumed independent of all the others, given the class)
- Bernoulli, Multinomial (suitable for text classification)
- Gaussian (suitable for continuous features, including high-dimensional data)
- Random Forests
- Gradient Boosted Decision Trees
- Neural Networks
- Data leakage
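Of the models above, random forests are a useful default; a minimal sketch on synthetic data (not from the course):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic data for illustration
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# An ensemble of decision trees: each tree is trained on a bootstrap
# sample and splits on random feature subsets, then trees vote
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)
print("test accuracy:", forest.score(X_test, y_test))
```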
Optional module: Unsupervised Machine Learning
- Dimensionality Reduction and Manifold Learning
- K-means clustering
- Agglomerative clustering
- DBSCAN Clustering
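K-means, the first of these clustering methods, can be sketched on synthetic blobs (illustrative only):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Three well-separated synthetic clusters in 2D
X, _ = make_blobs(n_samples=150, centers=3, random_state=0)

# K-means alternates assigning points to the nearest centroid
# and moving each centroid to the mean of its assigned points
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print("centroids shape:", kmeans.cluster_centers_.shape)
```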