01
Unit 1: High-Dimensional Phenomena and Curse of Dimensionality
Distance concentration and the empty space phenomenon, Concentration-of-measure inequalities, Nearest neighbor distances in high dimensions, Sparsity in high-dimensional data, Double descent phenomenon, Blessings vs. curse of dimensionality, The need for dimensionality reduction, Johnson-Lindenstrauss lemma for random projections.
02
Unit 2: Principal Component Analysis and Classical Methods
PCA mathematical formulation (covariance matrix eigendecomposition, SVD), PCA variants (Kernel PCA, Sparse PCA, Robust PCA), Incremental/online PCA, Factor analysis and independent component analysis (ICA), Multidimensional scaling (MDS: classical and non-metric), Isomap and geodesic distances.
03
Unit 3: Nonlinear Dimensionality Reduction
t-SNE algorithm (Student-t kernel, KL divergence objective, perplexity tuning), UMAP (Uniform Manifold Approximation and Projection), LargeVis and PHATE, Autoencoder architectures (vanilla, variational, denoising), Deep belief networks for representation learning, Self-supervised contrastive learning (SimCLR, MoCo), Manifold learning assumptions and topology preservation.
04
Unit 4: High-Dimensional Statistics and Regularization
Multiple testing problem and false discovery rate (FDR) control (Benjamini-Hochberg procedure), Sparsity-inducing regularization (Lasso, Elastic Net, Group Lasso), Stability selection and the knockoff framework, High-dimensional covariance estimation (graphical models, covariance shrinkage), Robust high-dimensional regression, Sure independence screening (SIS).
05
Unit 5: Scalable Algorithms and Embeddings
Random projection methods (CountSketch, Johnson-Lindenstrauss transforms), Locality-Sensitive Hashing (LSH) families (random hyperplanes, p-stable distributions), Approximate nearest neighbor search (HNSW, FAISS), Scaling word embeddings (GloVe, fastText), Graph embeddings (Node2Vec, DeepWalk), Tensor decomposition for multi-modal data.