- Core concepts and taxonomy of machine learning (supervised, unsupervised, reinforcement learning)
- Bias-variance tradeoff and model capacity
- Overfitting, underfitting, and the No Free Lunch theorem
- Evaluation metrics (accuracy, precision, recall, F1-score, ROC-AUC, confusion matrix)
- Cross-validation strategies (k-fold, stratified, time-series)
- Feature engineering fundamentals (scaling, encoding, feature selection)
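The evaluation metrics listed above all derive from the binary confusion matrix. A minimal sketch (the labels and predictions below are illustrative, not from any real dataset):

```python
# Computing accuracy, precision, recall, and F1 from binary predictions.

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn), treating label 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def metrics(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
print(metrics(y_true, y_pred))  # all four come out to 0.75 on this toy data
```

Note the guard clauses: precision and recall are undefined when their denominators are zero (e.g. a classifier that never predicts the positive class), and a convention such as returning 0.0 must be chosen.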
- Linear regression assumptions and ordinary least squares (OLS) solution
- Gradient descent variants (batch, stochastic, mini-batch)
- Regularization techniques (L1/Lasso, L2/Ridge, Elastic Net)
- Logistic regression for binary classification
- Probability interpretation and maximum likelihood estimation
- Multinomial logistic regression and softmax
- Model diagnostics (residual analysis, learning curves)
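The batch variant of gradient descent above can be sketched for one-feature linear regression; the learning rate, epoch count, and noise-free toy data are illustrative choices, not prescriptions:

```python
# Full-batch gradient descent on the mean-squared-error loss
# for the model y ≈ w*x + b.

def fit_linear_gd(xs, ys, lr=0.01, epochs=5000):
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of (1/n) * sum((w*x + b - y)^2) w.r.t. w and b
        dw = (2 / n) * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        db = (2 / n) * sum((w * x + b - y) for x, y in zip(xs, ys))
        w -= lr * dw
        b -= lr * db
    return w, b

xs = list(range(10))
ys = [2 * x + 1 for x in xs]      # ground truth: w = 2, b = 1
w, b = fit_linear_gd(xs, ys)
print(round(w, 3), round(b, 3))   # → 2.0 1.0
```

Stochastic and mini-batch variants differ only in computing `dw`/`db` over one sample or a small sample subset per update instead of the whole dataset.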
- Decision trees (CART algorithm, entropy, Gini impurity)
- Tree ensemble methods (bagging, random forests)
- Gradient boosting machines (XGBoost fundamentals)
- Support Vector Machines (maximal margin separator, kernel trick)
- Kernel functions (linear, polynomial, RBF)
- Nonlinear separability and the curse of dimensionality
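Gini impurity and the split search it drives are the core step CART repeats recursively. A minimal sketch for a one-feature decision stump (the toy data is illustrative):

```python
# Gini impurity and exhaustive best-split search for one feature.

def gini(labels):
    """Gini impurity: 1 - sum_k p_k^2 over class proportions p_k."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_split(xs, ys):
    """Return (threshold, weighted_gini) minimising impurity of x <= t vs x > t."""
    best = (None, float("inf"))
    for t in sorted(set(xs))[:-1]:           # candidate thresholds
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best[1]:
            best = (t, score)
    return best

xs = [1, 2, 3, 10, 11, 12]
ys = [0, 0, 0, 1, 1, 1]
print(best_split(xs, ys))   # → (3, 0.0): x <= 3 separates the classes perfectly
```

A full CART implementation applies `best_split` over all features, partitions the data on the winner, and recurses until a stopping criterion (depth, minimum samples, or zero impurity) is met.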
- K-means clustering algorithm and limitations
- Hierarchical clustering (agglomerative, divisive)
- Density-based clustering (DBSCAN)
- Dimensionality reduction techniques (PCA, t-SNE, UMAP)
- Anomaly detection methods (isolation forest, one-class SVM)
- Association rule mining (Apriori, FP-growth)
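The k-means algorithm above alternates assignment and update steps (Lloyd's algorithm). A minimal sketch with a deterministic initialisation so the run is repeatable; real implementations use smarter seeding (e.g. k-means++) and multiple restarts:

```python
# Lloyd's algorithm for k-means on 2-D points.

def kmeans(points, k, iters=20):
    centroids = [list(p) for p in points[:k]]        # naive deterministic init
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid's cluster.
        clusters = [[] for _ in range(k)]
        for p in points:
            dists = [sum((a - c) ** 2 for a, c in zip(p, cen)) for cen in centroids]
            clusters[dists.index(min(dists))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for i, cl in enumerate(clusters):
            if cl:                                   # leave empty clusters in place
                centroids[i] = [sum(coords) / len(cl) for coords in zip(*cl)]
    return centroids

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(kmeans(points, k=2))   # two centroids, one near each point cloud
```

The sketch also exposes the limitations named above: results depend on initialisation, k must be chosen in advance, and squared Euclidean distance biases the method toward spherical, similar-sized clusters.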
- Hyperparameter optimization (grid search, random search, Bayesian optimization)
- Ensemble learning principles (stacking, voting)
- Model interpretability techniques (SHAP, LIME, partial dependence plots)
- Introduction to ML pipelines and cross-validation pitfalls
- MLOps concepts (model versioning, monitoring, retraining)
- Practical considerations for production ML systems
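Grid search over a hyperparameter scored by k-fold cross-validation can be sketched end to end. The model here is a one-feature ridge regression without intercept (closed form w = Σxy / (Σx² + λ)); the data, grid, and fold count are illustrative assumptions:

```python
# Grid search over the ridge penalty lambda, scored by k-fold CV error.

def ridge_fit(xs, ys, lam):
    """Closed-form ridge solution for y ≈ w*x (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

def kfold_mse(xs, ys, lam, k=3):
    """Mean squared error over k contiguous held-out folds."""
    n = len(xs)
    fold = n // k
    total = 0.0
    for i in range(k):
        lo, hi = i * fold, (i + 1) * fold
        train_x = xs[:lo] + xs[hi:]
        train_y = ys[:lo] + ys[hi:]
        w = ridge_fit(train_x, train_y, lam)
        total += sum((w * x - y) ** 2 for x, y in zip(xs[lo:hi], ys[lo:hi]))
    return total / n

xs = list(range(1, 10))
ys = [3 * x for x in xs]                  # noise-free, so lambda = 0 wins
grid = [0.0, 0.1, 1.0, 10.0]
best_lam = min(grid, key=lambda lam: kfold_mse(xs, ys, lam))
print(best_lam)   # → 0.0
```

The same loop structure also illustrates a classic cross-validation pitfall: any preprocessing fitted on the full dataset (e.g. scaling) before the fold split leaks held-out information into training, which is why production pipelines fit preprocessing inside each fold.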