Bullet Points
This post lists various topics in machine learning.
Data
- data modalities
- numbers
- text
- images
- video
- audio
- data cleaning
- imbalanced data
- data normalization (one pitfall)
- standardization
- data augmentation
- data splitting
- cross validation
- RANSAC
- feature extraction/engineering
- domain expertise
- kernel method
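The outline does not say which normalization pitfall it has in mind. One commonly cited pitfall is computing the mean and standard deviation on the full dataset before splitting, which leaks test-set information into training. The sketch below is a minimal illustration of the alternative, assuming a single numeric feature; the helper names (`train_test_split`, `fit_standardizer`, `standardize`) and the toy data are hypothetical, not taken from any library.

```python
import random
import statistics


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and split it into (train, test)."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_standardizer(values):
    """Estimate (mean, std) from the given values only."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        std = 1.0  # avoid division by zero for a constant feature
    return mean, std


def standardize(values, mean, std):
    """Apply z = (x - mean) / std with previously fitted statistics."""
    return [(x - mean) / std for x in values]


# Toy data, made up for illustration.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
train, test = train_test_split(data)

mean, std = fit_standardizer(train)     # statistics come from the training split only
train_z = standardize(train, mean, std)
test_z = standardize(test, mean, std)   # the test split reuses the training statistics
```

With this ordering the training split has zero mean and unit variance by construction, while the test split generally does not, which is expected.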
Model
- supervised vs. unsupervised
- supervised learning
- classification
- regression (linear and non-linear case)
- unsupervised learning
- discovers inherent properties (latent variables) in the data
- includes
- clustering
- dimensionality reduction/manifold embedding
- anomaly detection
- latent-variable model
- supervised learning
- discriminative vs. generative
- classification vs. regression
- from classification model to regression model
- from binary classification to multi-class classification
- linear vs. non-linear
- parametric vs. non-parametric
- ensemble method
- bootstrap aggregating
- random forest
- gradient boosting
- least-squares boosting
- AdaBoost
- LogitBoost
- regularization and overfitting
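As a rough illustration of the bootstrap aggregating (bagging) item in the list above, the sketch below trains several weak classifiers on bootstrap samples and combines them by majority vote. The base learner is a hypothetical one-dimensional threshold "stump" chosen for brevity; the function names and the toy data are assumptions. A random forest would use decision trees with random feature selection instead.

```python
import random
from collections import Counter


def fit_stump(points):
    """Fit a 1-D threshold classifier: predict `right` if x > t else `left`.

    points is a list of (x, label) pairs with labels 0 or 1. The threshold
    and orientation with the fewest training errors are kept.
    """
    best = None
    for t in sorted({x for x, _ in points}):
        for left, right in ((0, 1), (1, 0)):
            errors = sum((right if x > t else left) != y for x, y in points)
            if best is None or errors < best[0]:
                best = (errors, t, left, right)
    _, t, left, right = best
    return lambda x: right if x > t else left


def fit_bagging(points, n_models=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        sample = [rng.choice(points) for _ in points]
        models.append(fit_stump(sample))
    return models


def predict_bagging(models, x):
    """Aggregate the individual predictions by majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


# Toy data, made up for illustration: class 0 on the left, class 1 on the right.
points = [(0.5, 0), (1.0, 0), (1.5, 0), (2.0, 0),
          (3.0, 1), (3.5, 1), (4.0, 1), (4.5, 1)]
models = fit_bagging(points)
```

Calling `predict_bagging(models, x)` then returns the label most of the stumps agree on. An odd `n_models` avoids ties in the binary case.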
Evaluation
training criterion
- mean squared error
- cross entropy
- impurity
- Gini index
- entropy
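The two impurity measures above can be written down directly from class proportions. The sketch below assumes the usual definitions, Gini index as 1 - Σ p_k² and entropy as -Σ p_k log2 p_k, with p_k the fraction of samples in class k; the function names are illustrative.

```python
import math
from collections import Counter


def class_proportions(labels):
    """Fraction of samples belonging to each class present in labels."""
    counts = Counter(labels)
    total = len(labels)
    return [count / total for count in counts.values()]


def gini_index(labels):
    """Gini impurity: 1 - sum_k p_k^2."""
    return 1.0 - sum(p * p for p in class_proportions(labels))


def entropy(labels):
    """Shannon entropy in bits: -sum_k p_k * log2(p_k)."""
    return -sum(p * math.log2(p) for p in class_proportions(labels))
```

Both measures are zero for a pure node (a single class) and largest when the classes are evenly mixed, which is why tree learners pick the split that reduces them the most.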
testing criterion
confusion matrix
| Real \ Pred | Positive | Negative |
| --- | --- | --- |
| Positive | TP | FN |
| Negative | FP | TN |

- accuracy
- precision
- recall
- F-score
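The four metrics above follow from the confusion-matrix counts TP, FN, FP and TN. The sketch below uses the standard definitions; the function name and the `beta` parameter (which gives the F1 score when set to 1) are choices made for this example.

```python
def classification_metrics(tp, fn, fp, tn, beta=1.0):
    """Compute accuracy, precision, recall and F-beta from confusion counts.

    Ratios with an empty denominator are reported as 0.0 by convention here.
    """
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0   # of predicted positives, how many are real
    recall = tp / (tp + fn) if (tp + fn) else 0.0      # of real positives, how many were found
    b2 = beta * beta
    denom = b2 * precision + recall
    f_score = (1 + b2) * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
    }


# Counts made up for illustration.
metrics = classification_metrics(tp=8, fn=2, fp=4, tn=6)
```

On imbalanced data accuracy can look good even when recall on the rare class is poor, which is one reason to report precision and recall alongside it.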
precision-recall curve
receiver operating characteristic (ROC) curve
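An ROC curve is traced by sweeping the decision threshold over the classifier's scores and recording the false positive rate against the true positive rate at each step. The sketch below is one minimal way to do that, assuming binary labels (1 for positive, 0 for negative), higher scores meaning "more positive", and both classes being present; the function names and sample data are illustrative.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) pairs obtained by lowering the threshold step by step."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])

    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last example in a group of tied scores.
        is_last_of_tie = i + 1 == len(ranked) or ranked[i + 1][0] != score
        if is_last_of_tie:
            points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Labels and scores made up for illustration.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(labels, scores)
area = auc(curve)
```

The area under this curve equals the fraction of (positive, negative) pairs that the scores rank correctly; a precision-recall curve can be built with the same threshold sweep by recording precision and recall instead.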