Distance concentration and emptiness phenomenon, Concentration of measure inequality, Nearest neighbor distances in high dimensions, Sparsity in high-dimensional data, Double descent phenomenon, Blessing of dimensionality vs. curse, Dimension reduction necessity, Johnson-Lindenstrauss lemma for random projections.
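The Johnson-Lindenstrauss lemma above can be illustrated with a minimal sketch, assuming a plain Gaussian random projection in NumPy (dimensions and the seed are illustrative, not prescribed by the lemma's constants):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 100, 10_000, 1_000  # n points in d dims, projected down to k dims
X = rng.normal(size=(n, d))

# Gaussian random projection matrix, scaled by 1/sqrt(k) so that squared
# pairwise distances are preserved in expectation.
R = rng.normal(size=(d, k)) / np.sqrt(k)
Y = X @ R

# Distortion of one pairwise distance; JL guarantees this ratio is close to 1
# with high probability when k = O(log n / eps^2).
i, j = 0, 1
orig_dist = np.linalg.norm(X[i] - X[j])
proj_dist = np.linalg.norm(Y[i] - Y[j])
ratio = proj_dist / orig_dist
```

With k = 1000 the typical relative distortion is on the order of a few percent, which is what makes random projection a practical preprocessing step before more expensive methods.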
PCA mathematical formulation (covariance matrix eigendecomposition, SVD), PCA variants (Kernel PCA, Sparse PCA, Robust PCA), Incremental/online PCA, Factor analysis and independent component analysis (ICA), Multidimensional scaling (MDS - classical, non-metric), Isomap and geodesic distances.
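The two PCA formulations listed here (covariance eigendecomposition and SVD) are equivalent, which a short NumPy sketch can confirm; the synthetic data below is only illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))  # correlated features

# Route 1: PCA via SVD of the centered data matrix
# (numerically preferred over forming the covariance explicitly).
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
explained_variance = S**2 / (len(X) - 1)  # covariance eigenvalues
components = Vt                           # principal directions (rows)

# Route 2: eigendecomposition of the sample covariance matrix.
cov = np.cov(Xc, rowvar=False)
eigvals = np.sort(np.linalg.eigvalsh(cov))[::-1]
```

The singular values of the centered data and the covariance eigenvalues agree up to the 1/(n-1) scaling, so either route yields the same principal subspace.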
t-SNE algorithm (Student-t low-dimensional kernel, KL-divergence objective, perplexity tuning), UMAP (Uniform Manifold Approximation and Projection), LargeVis and PHATE, Autoencoder architectures (vanilla, variational, denoising), Deep belief networks for representation learning, Self-supervised contrastive learning (SimCLR, MoCo), Manifold learning assumptions and topology preservation.
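The perplexity tuning inside t-SNE can be sketched in isolation: for each point, the Gaussian bandwidth sigma_i is bisected so the conditional neighbor distribution reaches the requested perplexity (2 to the power of its entropy). This is a minimal illustrative sketch, not the full algorithm; function names and the target value are assumptions:

```python
import numpy as np

def conditional_probs(dist_sq, sigma):
    # p_{j|i}: Gaussian affinities from point i, normalized to sum to 1.
    p = np.exp(-dist_sq / (2 * sigma**2))
    return p / p.sum()

def perplexity(p):
    entropy = -np.sum(p * np.log2(p + 1e-12))
    return 2.0**entropy

def calibrate_sigma(dist_sq, target=30.0, iters=50):
    # Bisection: perplexity grows monotonically with sigma.
    lo, hi = 1e-10, 1e10
    for _ in range(iters):
        sigma = (lo + hi) / 2
        if perplexity(conditional_probs(dist_sq, sigma)) > target:
            hi = sigma  # distribution too flat: shrink the bandwidth
        else:
            lo = sigma
    return sigma

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 8))
d2 = np.sum((X - X[0])**2, axis=1)[1:]  # squared distances from point 0
sigma0 = calibrate_sigma(d2)
```

In the low-dimensional map, t-SNE then swaps the Gaussian for a heavy-tailed Student-t kernel, which is what relieves the crowding problem.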
Multiple testing problem and FDR control (Benjamini-Hochberg procedure), Sparsity-inducing regularization (Lasso, Elastic Net, Group Lasso), Stability selection and knockoffs framework, High-dimensional covariance estimation (graphical models, covariance shrinkage), Robust high-dimensional regression, Sure independence screening (SIS).
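The Benjamini-Hochberg step-up procedure listed above is short enough to sketch directly; the p-values below are made up for illustration:

```python
import numpy as np

def benjamini_hochberg(pvals, alpha=0.05):
    """Step-up FDR control: reject the k smallest p-values, where k is the
    largest index with p_(k) <= alpha * k / m (sorted ascending)."""
    p = np.asarray(pvals)
    m = len(p)
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, m + 1) / m
    below = p[order] <= thresholds
    k = np.max(np.nonzero(below)[0]) + 1 if below.any() else 0
    rejected = np.zeros(m, dtype=bool)
    rejected[order[:k]] = True
    return rejected

pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.3, 0.9]
rej = benjamini_hochberg(pvals)
```

Note the step-up nature: a p-value above its own threshold can still be rejected if a larger-indexed one falls below its threshold, which is what distinguishes BH from naive per-test cutoffs like Bonferroni.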
Random projection methods (CountSketch, Johnson-Lindenstrauss transforms), Locality-Sensitive Hashing (LSH) families (random hyperplanes, p-stable distributions), Approximate nearest neighbors (HNSW, FAISS), Word embeddings scaling (GloVe, fastText), Graph embeddings (Node2Vec, DeepWalk), Tensor decomposition for multi-modal data.
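The random-hyperplane LSH family mentioned above (often called SimHash) can be sketched in a few lines: the sign pattern of a vector against random hyperplanes forms a bit signature, and cosine-similar vectors agree on most bits. Sizes, the seed, and the perturbation scale are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
d, n_bits = 64, 16
hyperplanes = rng.normal(size=(n_bits, d))  # one random hyperplane per bit

def signature(v):
    # Bit i is 1 iff v lies on the positive side of hyperplane i.
    return tuple((hyperplanes @ v > 0).astype(int))

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

v = rng.normal(size=d)
near = v + 0.01 * rng.normal(size=d)  # small perturbation of v
far = rng.normal(size=d)              # unrelated vector

sig_v, sig_near, sig_far = signature(v), signature(near), signature(far)
```

The collision probability per bit is 1 - theta/pi for vectors at angle theta, so bucketing by signature concentrates candidate pairs with high cosine similarity; libraries such as FAISS build far more engineered variants on the same idea.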