ML301 | Semester 5 | 3 credits (2-0-2) | Major

Deep Learning & Neural Networks

An introduction to deep learning: neural network fundamentals and optimization, convolutional networks, recurrent and sequence models including Transformers, generative models (GANs, VAEs, diffusion), and practical training, optimization, and deployment.

Syllabus

Unit 1: Neural Networks Fundamentals

Perceptron and multi-layer perceptron (MLP); activation functions (sigmoid, tanh, ReLU, Leaky ReLU, Swish); forward propagation and derivation of the backpropagation algorithm; gradient descent optimization (SGD, momentum, AdaGrad, RMSprop, Adam); the vanishing/exploding gradients problem; weight initialization strategies (Xavier, He); the universal approximation theorem.
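To make backpropagation and SGD concrete, here is a minimal NumPy sketch (an illustrative supplement, not part of the syllabus materials) of a one-hidden-layer MLP with He-style initialization; the layer sizes, learning rate, and MSE loss are arbitrary toy choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny MLP: x -> ReLU(W1 x + b1) -> W2 h + b2, trained with plain SGD.
n_in, n_hid, n_out = 4, 8, 1
W1 = rng.normal(0, np.sqrt(2.0 / n_in), (n_hid, n_in))    # He init for ReLU layer
b1 = np.zeros(n_hid)
W2 = rng.normal(0, np.sqrt(1.0 / n_hid), (n_out, n_hid))  # Xavier-style init
b2 = np.zeros(n_out)

def forward(x):
    z1 = W1 @ x + b1
    h = np.maximum(z1, 0.0)          # ReLU activation
    y = W2 @ h + b2                  # linear output (toy regression)
    return z1, h, y

def backward(x, z1, h, y, target):
    # MSE loss L = 0.5 * (y - target)^2; gradients via the chain rule.
    dy = y - target                  # dL/dy
    dW2 = np.outer(dy, h)
    db2 = dy
    dh = W2.T @ dy
    dz1 = dh * (z1 > 0)              # ReLU gradient mask
    dW1 = np.outer(dz1, x)
    db1 = dz1
    return dW1, db1, dW2, db2

lr = 0.05
x, target = rng.normal(size=n_in), np.array([1.0])
for step in range(200):
    z1, h, y = forward(x)
    dW1, db1, dW2, db2 = backward(x, z1, h, y, target)
    W1 -= lr * dW1; b1 -= lr * db1   # vanilla SGD update
    W2 -= lr * dW2; b2 -= lr * db2
```

The ReLU gradient mask `(z1 > 0)` is exactly the spot where a saturating activation such as sigmoid would instead contribute the small derivatives behind the vanishing-gradient problem.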

Unit 2: Convolutional Neural Networks

Motivation for CNNs in image processing; the convolution operation and feature maps; pooling layers (max, average, global); CNN architectures (LeNet, AlexNet, VGG, and ResNet with residual connections); transfer learning and fine-tuning strategies; data augmentation techniques; batch normalization and layer normalization; object detection fundamentals (sliding windows, region proposals).
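As a hedged illustration of transfer learning, the sketch below assumes PyTorch and torchvision (frameworks not specified by the syllabus): an ImageNet-pretrained ResNet-18 backbone is frozen and only a new classification head is trained. The 10-class head, batch shape, and optimizer settings are placeholders, and the `weights=` argument requires torchvision 0.13 or later.

```python
import torch
import torch.nn as nn
from torchvision import models

# Load an ImageNet-pretrained ResNet-18 and fine-tune only the final layer.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for p in model.parameters():
    p.requires_grad = False                       # freeze backbone features
model.fc = nn.Linear(model.fc.in_features, 10)    # new head for 10 classes

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One dummy training step on random data (batch of 8 RGB 224x224 images).
x = torch.randn(8, 3, 224, 224)
y = torch.randint(0, 10, (8,))
optimizer.zero_grad()
loss = criterion(model(x), y)
loss.backward()                                   # gradients flow into the head only
optimizer.step()
```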

Unit 3: Recurrent Networks and Sequence Models

RNN architecture and the vanishing gradient problem; Long Short-Term Memory (LSTM) cells and gates; Gated Recurrent Units (GRU); bidirectional RNNs; sequence-to-sequence models with attention mechanisms; beam search decoding; the Transformer architecture (self-attention, multi-head attention, positional encoding); the BERT and GPT model families.
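The core of the Transformer topics above is scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. Below is a minimal PyTorch sketch of that formula; the tensor shapes are toy values, and passing the same tensor as Q, K, and V is what makes it self-attention.

```python
import math
import torch

def scaled_dot_product_attention(Q, K, V, mask=None):
    # scores[i, j] measures how much query i attends to key j.
    d_k = Q.size(-1)
    scores = Q @ K.transpose(-2, -1) / math.sqrt(d_k)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = torch.softmax(scores, dim=-1)   # attention weights over keys
    return weights @ V                        # weighted sum of values

# Toy example: batch of 2 sequences, length 5, model dimension 16.
x = torch.randn(2, 5, 16)
out = scaled_dot_product_attention(x, x, x)   # self-attention: Q = K = V = x
print(out.shape)                              # torch.Size([2, 5, 16])
```

Multi-head attention runs several such maps in parallel on learned projections of Q, K, and V and concatenates the results.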

Unit 4: Advanced Architectures and Generative Models

Generative Adversarial Networks (GANs): the minimax game, training stability, and DCGAN improvements; Variational Autoencoders (VAEs): encoder-decoder structure and the KL divergence term; fundamentals of diffusion models (forward/reverse diffusion process, DDPM); autoencoders and dimensionality reduction; introduction to Graph Neural Networks (GNNs); model compression techniques (pruning, quantization).
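To ground the VAE topics, here is a minimal PyTorch sketch of the reparameterization trick and the closed-form KL divergence between a diagonal Gaussian posterior and a standard normal prior; the single-linear-layer encoder/decoder, dimensions, and MSE reconstruction loss are deliberate simplifications for illustration.

```python
import torch
import torch.nn as nn

class TinyVAE(nn.Module):
    # Encoder outputs the mean and log-variance of q(z|x);
    # the reparameterization trick keeps sampling differentiable.
    def __init__(self, d_in=784, d_z=16):
        super().__init__()
        self.enc = nn.Linear(d_in, 2 * d_z)
        self.dec = nn.Linear(d_z, d_in)

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # z = mu + sigma * eps
        return self.dec(z), mu, logvar

def vae_loss(x_hat, x, mu, logvar):
    recon = nn.functional.mse_loss(x_hat, x, reduction="sum")
    # Closed-form KL(q(z|x) || N(0, I)) for a diagonal Gaussian posterior.
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl

vae = TinyVAE()
x = torch.rand(32, 784)                 # e.g. flattened 28x28 images
x_hat, mu, logvar = vae(x)
loss = vae_loss(x_hat, x, mu, logvar)
loss.backward()
```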

Unit 5: Training, Optimization, and Deployment

Loss functions (cross-entropy, MSE, Dice, contrastive losses); learning rate schedules and schedulers; gradient clipping; mixed precision training; distributed training strategies (data parallelism, model parallelism); preventing overfitting (dropout, early stopping, data augmentation); model evaluation for deep learning (confusion matrices, precision-recall curves); deployment with TensorRT/ONNX; edge deployment considerations.
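As a toy end-to-end example of several of these training techniques, the PyTorch sketch below combines momentum SGD, a cosine learning-rate schedule, and global-norm gradient clipping in one loop; the model, random data, and hyperparameters are placeholders rather than recommendations.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
# Cosine annealing decays the LR from 0.1 toward zero over 100 steps.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=100)
criterion = nn.CrossEntropyLoss()

for step in range(100):
    x = torch.randn(32, 20)              # random stand-in for a data batch
    y = torch.randint(0, 2, (32,))
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    # Clip the global gradient norm to stabilize training.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()
    scheduler.step()                     # advance the LR schedule once per step
```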