DS501 | Semester 6 | 3 (2-0-2) | Major

Predictive Modeling & Analytics

Unit 1: Predictive Modeling Foundations

Supervised vs. unsupervised learning review, Regression vs. classification frameworks, Model evaluation metrics (MAE, RMSE, R² for regression; precision, recall, F1, AUC for classification), Cross-validation strategies (k-fold, stratified, time-series), Bias-variance tradeoff revisited, Overfitting detection (learning curves, validation monitoring), Model selection criteria (AIC, BIC, cross-validated scores).
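As an illustration of the evaluation metrics listed in this unit, the following is a minimal sketch in plain Python. The function names `regression_metrics` and `classification_metrics` are hypothetical and not part of the course material; in practice a library such as scikit-learn would normally be used.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R^2) for two equal-length numeric sequences.

    Assumes y_true is not constant, otherwise R^2 is undefined.
    """
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2


def classification_metrics(y_true, y_pred):
    """Return (precision, recall, F1) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    print(regression_metrics([3, 5, 7, 9], [2, 5, 8, 9]))
    print(classification_metrics([1, 1, 1, 0, 0, 0], [1, 1, 0, 1, 0, 0]))
```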

Unit 2: Ensemble Methods (Bagging and Boosting)

Bagging fundamentals and variance reduction, Random Forests (feature bagging, bootstrap aggregating), Extra Trees and feature importance interpretation, Out-of-bag error estimation, Gradient Boosting Machines (GBM) - forward stage-wise additive modeling, XGBoost architecture (regularized objectives, second-order gradients), LightGBM and CatBoost optimizations.
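Bootstrap aggregating and out-of-bag (OOB) error estimation can be sketched as follows. This is an illustrative toy in plain Python, assuming a one-dimensional 1-nearest-neighbour regressor as the high-variance base learner; all function names are hypothetical, and Random Forests use decision trees with feature subsampling rather than this base learner.

```python
import random


def bootstrap_indices(n, rng):
    """Draw n indices with replacement; return (in-bag list, out-of-bag list)."""
    in_bag = [rng.randrange(n) for _ in range(n)]
    in_bag_set = set(in_bag)
    oob = [i for i in range(n) if i not in in_bag_set]
    return in_bag, oob


def one_nn_predict(sample, x):
    """Toy base learner: return the y of the (x, y) pair whose x is nearest."""
    return min(sample, key=lambda pair: abs(pair[0] - x))[1]


def bagged_fit(data, n_estimators, seed=0):
    """Build one bootstrap sample per estimator, remembering its OOB indices."""
    rng = random.Random(seed)
    ensemble = []
    for _ in range(n_estimators):
        in_bag, oob = bootstrap_indices(len(data), rng)
        ensemble.append(([data[i] for i in in_bag], set(oob)))
    return ensemble


def bagged_predict(ensemble, x):
    """Average the base-learner predictions over all bootstrap samples."""
    preds = [one_nn_predict(sample, x) for sample, _ in ensemble]
    return sum(preds) / len(preds)


def oob_mse(data, ensemble):
    """Score each point using only the estimators that did not train on it."""
    sq_errors = []
    for i, (x, y) in enumerate(data):
        preds = [one_nn_predict(s, x) for s, oob in ensemble if i in oob]
        if preds:  # skip points that were in-bag for every estimator
            sq_errors.append((sum(preds) / len(preds) - y) ** 2)
    return sum(sq_errors) / len(sq_errors) if sq_errors else float("nan")


if __name__ == "__main__":
    data = [(float(i), 2.0 * i) for i in range(20)]
    ensemble = bagged_fit(data, n_estimators=25, seed=0)
    print(bagged_predict(ensemble, 10.0), oob_mse(data, ensemble))
```

The OOB estimate needs no separate validation set, because each point is scored only by estimators whose bootstrap sample excluded it.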

Unit 3: Time Series Forecasting

Time series components (trend, seasonality, cyclicity, irregularity), Stationarity testing (ADF, KPSS tests), ARIMA/SARIMA models (ACF, PACF analysis, differencing), Exponential smoothing methods (Holt-Winters), Prophet forecasting framework, LSTM/GRU for sequential forecasting, Cross-validation for time series (purged/time-series splits), Anomaly detection in temporal data.
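The idea behind time-series cross-validation, where training data always precedes test data, can be sketched with an expanding-window splitter. This is a minimal plain-Python sketch; the function name `expanding_window_splits` and its `gap` parameter are assumptions for illustration (the gap loosely mimics the purging of observations adjacent to the test fold).

```python
def expanding_window_splits(n_samples, n_splits, gap=0):
    """Return a list of (train_indices, test_indices) pairs.

    Each test fold is a contiguous block that comes strictly after its
    training block; the training window grows from one split to the next.
    `gap` drops that many observations between train and test.
    """
    test_size = n_samples // (n_splits + 1)
    if test_size == 0:
        raise ValueError("too many splits for the number of samples")
    splits = []
    for k in range(n_splits):
        test_start = n_samples - (n_splits - k) * test_size
        train_end = test_start - gap
        if train_end <= 0:
            raise ValueError("gap leaves no training data for the first split")
        train = list(range(train_end))
        test = list(range(test_start, test_start + test_size))
        splits.append((train, test))
    return splits


if __name__ == "__main__":
    for train, test in expanding_window_splits(10, n_splits=3):
        print(train, test)
```

Unlike shuffled k-fold, no split ever trains on observations that occur after the ones it is evaluated on.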

Unit 4: Survival Analysis and Imbalanced Learning

Survival function, hazard rates, Kaplan-Meier estimator, Cox proportional hazards model, Time-to-event prediction metrics (C-index, Brier score), Imbalanced classification techniques (SMOTE, ADASYN, undersampling), Cost-sensitive learning, Anomaly detection algorithms (isolation forest, one-class SVM, autoencoders), Class imbalance evaluation (PR-AUC, Matthews correlation coefficient).
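The Kaplan-Meier estimator listed above multiplies, at each observed event time, the fraction of at-risk subjects that survive it. A minimal plain-Python sketch follows; the function name `kaplan_meier` and the input encoding (1 for an observed event, 0 for a censored observation) are assumptions for illustration.

```python
def kaplan_meier(durations, events):
    """Return [(t, S(t))] at each distinct event time, in increasing order.

    durations: observed time for each subject
    events:    1 if the event was observed at that time, 0 if censored
    Subjects censored at time t are still counted as at risk at t.
    """
    event_times = sorted({t for t, e in zip(durations, events) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        at_risk = sum(1 for d in durations if d >= t)
        deaths = sum(1 for d, e in zip(durations, events) if d == t and e)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


if __name__ == "__main__":
    # Five subjects; the third and fifth are censored.
    for t, s in kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0]):
        print(t, round(s, 3))
```

Censored subjects never cause a drop in the curve, but they shrink the at-risk count for later event times.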

Unit 5: Model Deployment, Monitoring, and MLOps

Model serialization (joblib, ONNX), Containerization (Docker for ML models), API serving (Flask, FastAPI, TensorFlow Serving), Model monitoring (drift detection, performance degradation), A/B testing and champion/challenger patterns, Automated retraining pipelines, Explainable AI techniques (SHAP, LIME, partial dependence), Production ML challenges (data versioning, lineage tracking).
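One simple way to illustrate drift detection is the population stability index (PSI), which compares the binned distribution of a feature at training time with its distribution in production. PSI is not named in the syllabus; it is used here only as an assumed example of a drift statistic, and the function name, equal-width binning, and `eps` floor are illustrative choices.

```python
import math


def population_stability_index(reference, current, n_bins=10, eps=1e-6):
    """PSI between a reference sample and a current sample of one feature.

    Bins are equal-width over the reference range; current values outside
    that range are clipped into the first or last bin. Empty bins are
    floored at `eps` so the logarithm stays defined.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins
    if width == 0:
        width = 1.0  # degenerate case: constant reference feature

    def fractions(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)
            counts[idx] += 1
        return [max(c / len(values), eps) for c in counts]

    ref = fractions(reference)
    cur = fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


if __name__ == "__main__":
    reference = list(range(100))
    shifted = [v + 50 for v in reference]
    print(population_stability_index(reference, reference))
    print(population_stability_index(reference, shifted))
```

A value near zero suggests the two samples are distributed alike; larger values indicate a shift that may justify investigation or retraining.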

Top skills

Data Structures, Algorithms, Deep Learning, Statistics, Robotics

Structure

Semester: 6

Credits: 3 (2-0-2)

Category: Major