ML202 · Semester 4 · 3 Credits (2-0-2) · Major

Introduction to Machine Learning


Syllabus


Unit 1: Foundations of Machine Learning

Core concepts and taxonomy of machine learning (supervised, unsupervised, reinforcement learning); Bias-variance tradeoff and model capacity; Overfitting, underfitting, and the No Free Lunch theorem; Evaluation metrics (accuracy, precision, recall, F1-score, ROC-AUC, confusion matrix); Cross-validation strategies (k-fold, stratified, time-series); Feature engineering fundamentals (scaling, encoding, feature selection).
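The evaluation metrics listed above can be computed directly from a confusion matrix. The sketch below is illustrative, not part of the course materials; the function names and toy labels are our own:

```python
# Binary-classification metrics from a confusion matrix (1 = positive, 0 = negative).

def confusion_counts(y_true, y_pred):
    """Return (TP, FP, FN, TN) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def metrics(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Toy predictions: 3 true positives, 1 false positive, 1 false negative, 3 true negatives.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]
print(metrics(y_true, y_pred))
```

Precision and recall trade off against each other, which is why the F1-score (their harmonic mean) appears on the syllabus alongside plain accuracy.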


Unit 2: Linear Models and Regression

Linear regression assumptions and ordinary least squares (OLS) solution; Gradient descent variants (batch, stochastic, mini-batch); Regularization techniques (L1/Lasso, L2/Ridge, Elastic Net); Logistic regression for binary classification; Probability interpretation and maximum likelihood estimation; Multinomial logistic regression and softmax; Model diagnostics (residual analysis, learning curves).
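Batch gradient descent for linear regression, with an optional L2 (ridge) penalty, can be sketched in a few lines. The data, learning rate, and epoch count below are illustrative choices, not values from the course:

```python
# Batch gradient descent for simple linear regression y ≈ w*x + b,
# minimizing mean squared error with an optional L2 penalty on w.

def fit_linear(xs, ys, lr=0.01, epochs=2000, l2=0.0):
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error (ridge term applies to w only).
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n + 2 * l2 * w
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # exactly y = 2x + 1
w, b = fit_linear(xs, ys)
```

Stochastic and mini-batch variants differ only in how many samples enter each gradient estimate; the update rule itself is unchanged.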


Unit 3: Nonlinear Models and Decision Boundaries

Decision trees (CART algorithm, entropy, Gini impurity); Tree ensemble methods (bagging, random forests); Gradient boosting machines (XGBoost fundamentals); Support Vector Machines (maximal margin separator, kernel trick); Kernel functions (linear, polynomial, RBF); Nonlinear separability and the curse of dimensionality.
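The core of CART is a split search that minimizes an impurity measure such as Gini. Here is a minimal single-feature sketch; the toy data and function names are our own, not from the course:

```python
# CART-style split search: pick the threshold t on one feature that
# minimizes the weighted Gini impurity of the two resulting partitions.

def gini(labels):
    """Gini impurity: 1 - sum_k p_k^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_split(xs, ys):
    """Return (threshold, weighted_gini) of the best split x <= t."""
    best = (None, float("inf"))
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(xs)
        if score < best[1]:
            best = (t, score)
    return best

xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
ys = [0, 0, 0, 1, 1, 1]
t, score = best_split(xs, ys)  # the clean split at x <= 3 reaches impurity 0
```

A full tree applies this search recursively to each partition; entropy-based information gain works the same way with a different impurity function.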


Unit 4: Unsupervised Learning and Clustering

K-means clustering algorithm and limitations; Hierarchical clustering (agglomerative, divisive); Density-based clustering (DBSCAN); Dimensionality reduction techniques (PCA, t-SNE, UMAP); Anomaly detection methods (isolation forest, one-class SVM); Association rule mining (Apriori, FP-growth).
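K-means (Lloyd's algorithm) alternates between assigning points to their nearest center and moving each center to its cluster mean. A minimal 1-D sketch, with fixed initial centers for reproducibility (real implementations use k-means++ or random restarts):

```python
# Lloyd's algorithm for k-means on 1-D points.

def kmeans_1d(points, centers, iters=20):
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            i = min(range(len(centers)), key=lambda j: abs(p - centers[j]))
            clusters[i].append(p)
        # Update step: each center moves to its cluster mean
        # (an empty cluster keeps its old center).
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers

points = [1.0, 1.2, 0.8, 9.0, 9.3, 8.7]
centers = kmeans_1d(points, centers=[0.0, 5.0])  # converges near 1.0 and 9.0
```

The limitations noted in the syllabus show up even in this sketch: the result depends on initialization, k must be chosen in advance, and the mean-based update assumes roughly spherical clusters, which is what motivates alternatives like DBSCAN.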


Unit 5: Model Selection, Optimization, and Deployment

Hyperparameter optimization (grid search, random search, Bayesian optimization); Ensemble learning principles (stacking, voting); Model interpretability techniques (SHAP, LIME, partial dependence plots); Introduction to ML pipelines and cross-validation pitfalls; MLOps concepts (model versioning, monitoring, retraining); Practical considerations for production ML systems.
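Grid search combined with k-fold cross-validation is the simplest hyperparameter-optimization scheme on the list. The sketch below searches a ridge penalty for a tiny 1-D regression; the data, grid values, and function names are illustrative assumptions, not course-specified:

```python
# Grid search over a ridge penalty λ, scored by k-fold cross-validated MSE.

def ridge_fit(xs, ys, lam):
    """Closed-form 1-D ridge fit for y ≈ w*x (no intercept): w = Σxy / (Σx² + λ)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

def kfold_mse(xs, ys, lam, k=3):
    """Mean squared validation error over k contiguous folds."""
    n = len(xs)
    fold = n // k
    total, count = 0.0, 0
    for i in range(k):
        lo, hi = i * fold, (i + 1) * fold
        tr_x = xs[:lo] + xs[hi:]
        tr_y = ys[:lo] + ys[hi:]
        va_x, va_y = xs[lo:hi], ys[lo:hi]
        w = ridge_fit(tr_x, tr_y, lam)
        total += sum((w * x - y) ** 2 for x, y in zip(va_x, va_y))
        count += len(va_x)
    return total / count

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2]  # roughly y = 2x with small noise
grid = [0.0, 0.1, 1.0, 10.0]
best_lam = min(grid, key=lambda lam: kfold_mse(xs, ys, lam))
```

One of the cross-validation pitfalls the unit covers is visible here: any preprocessing (scaling, feature selection) must be fitted inside each training fold, not on the full dataset, or the validation scores leak information.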