0-intro
Roadmap
- What is the difference between machine learning and optimization?
  - Training a machine learning model is an optimization problem. Machine learning additionally asks how well the trained model generalizes from the training data to unseen test data.
- Statistical models (multivariate central limit theorem, exemplified with Gaussian and multinomial distributions) and their limitations for machine learning analysis
  - The analysis mostly focuses on asymptotic scenarios and does not provide non-asymptotic guarantees.
  - The analysis requires a well-behaved statistical model, e.g. a normal or multinomial distribution. Real-world image, text, and sound distributions can be far more complex.
  - The analysis aims to characterize the entire probability distribution, which can be too costly to compute and analyze. Analyzing the error directly can be more feasible for large-scale machine learning models and datasets. For example, an approximation bound, as opposed to a statistical guarantee, is also useful.
- What is machine learning theory about?
  - Analysis of finite training data falls under probability, statistics, and information theory.
  - Analysis of learning models falls under functional analysis and signal processing.
  - Analysis of computing algorithms falls under optimization and computation theory.
- What is this course about?
  - Theory of model-based statistical learning: exponential families, maximum likelihood, method of moments, maximum entropy principle
  - Theory of model-free machine learning: uniform convergence bounds, VC dimension, Rademacher complexity, covering numbers
  - Theory of representation: kernel functions and methods, approximation in deep learning
  - Theory of convergence: optimization and generalization in deep learning, convex vs. non-convex machine learning problems
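
The distinction between optimization and learning in the first roadmap item can be made concrete with a small simulation. The sketch below is illustrative only and is not taken from the course material: the data model (labels equal to sign(x), flipped with probability 0.2), the sample sizes, and the helper names `sample`, `make_memorizer`, `threshold_rule`, and `error_rate` are all assumptions. It compares a rule that drives the training error to zero by memorizing the training set (1-nearest neighbor) with a simple fixed threshold rule.

```python
import random


def sample(n, rng, noise=0.2):
    """Draw n points x ~ Uniform(-1, 1) with label sign(x), flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = 1 if x >= 0 else -1
        if rng.random() < noise:
            y = -y  # label noise: the training labels are not fully trustworthy
        data.append((x, y))
    return data


def threshold_rule(x):
    """A simple rule that ignores the noise: predict the sign of x."""
    return 1 if x >= 0 else -1


def make_memorizer(train):
    """1-nearest-neighbor rule: return the label of the closest training point."""
    def predict(x):
        _, y = min(train, key=lambda pair: abs(pair[0] - x))
        return y
    return predict


def error_rate(predict, data):
    """Fraction of points in `data` that `predict` labels incorrectly."""
    return sum(predict(x) != y for x, y in data) / len(data)


rng = random.Random(0)      # fixed seed so the run is reproducible
train = sample(200, rng)    # data the rule is fitted on
test = sample(2000, rng)    # fresh data from the same distribution

memorizer = make_memorizer(train)
results = {
    "memorizer": (error_rate(memorizer, train), error_rate(memorizer, test)),
    "threshold": (error_rate(threshold_rule, train), error_rate(threshold_rule, test)),
}
for name, (train_err, test_err) in results.items():
    print(f"{name}: train error = {train_err:.3f}, test error = {test_err:.3f}")
```

Judged purely as an optimizer of the training objective, the memorizer is the better rule, since it reproduces every training label. Its error on fresh data is nonzero, however, because it also reproduces the label noise. One would typically expect the threshold rule to have a smaller gap between training and test error; quantifying that gap is the generalization question that the uniform convergence tools in the roadmap address.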
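
For reference, the multivariate central limit theorem mentioned in the second roadmap item can be stated as follows. This is the standard textbook statement, written in notation chosen here ($X_i$, $\mu$, $\Sigma$, $p$, $e_j$) rather than notation taken from the course.

```latex
% Multivariate CLT: X_1, ..., X_n i.i.d. in R^d with mean \mu and finite covariance \Sigma.
\[
  \sqrt{n}\,\bigl(\bar{X}_n - \mu\bigr) \xrightarrow{\;d\;} \mathcal{N}(0, \Sigma),
  \qquad
  \bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i .
\]

% Gaussian example: if X_i \sim \mathcal{N}(\mu, \Sigma), the statement holds exactly for every n.
\[
  \sqrt{n}\,\bigl(\bar{X}_n - \mu\bigr) \sim \mathcal{N}(0, \Sigma).
\]

% Multinomial example: X_i is one-hot, with P(X_i = e_j) = p_j for j = 1, ..., k.
\[
  \mu = p,
  \qquad
  \Sigma = \operatorname{diag}(p) - p\,p^{\top}.
\]
```

The limit describes the fluctuation of the empirical mean only as $n \to \infty$. It does not say how large $n$ must be for the approximation to be accurate, which is the gap that non-asymptotic guarantees are meant to fill.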