Image formation and digitization (sampling, quantization), Pixel representations and color spaces (RGB, HSV, Lab, grayscale), Spatial domain filtering (linear and nonlinear filters, smoothing, sharpening), Edge detection (Sobel, Prewitt, Canny, Laplacian of Gaussian), Histogram equalization and contrast enhancement, Morphological operations (erosion, dilation, opening, closing).
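
As an illustration of one topic in this unit, histogram equalization can be sketched in a few lines of plain Python. This is a minimal sketch, assuming a grayscale image stored as a list of rows of integers in the range 0 to levels - 1; the function name `equalize_histogram` and its parameters are hypothetical and do not come from any particular library.

```python
def equalize_histogram(image, levels=256):
    """Histogram-equalize a grayscale image given as a list of rows of ints."""
    pixels = [value for row in image for value in row]
    total = len(pixels)

    # Count how often each gray level occurs.
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1

    # Cumulative distribution: number of pixels with intensity <= level.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    # Smallest non-zero CDF value, used so the darkest level maps to 0.
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Constant image: there is no contrast to stretch.
        return [row[:] for row in image]

    # Lookup table that maps each old gray level to its equalized level.
    scale = (levels - 1) / (total - cdf_min)
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lut[value] for value in row] for row in image]


if __name__ == "__main__":
    low_contrast = [[50, 50], [100, 150]]
    print(equalize_histogram(low_contrast))
```

The lookup table stretches the cumulative distribution over the full output range, so the darkest occupied level maps to 0 and the brightest to levels - 1, while the ordering of intensities is preserved.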
2D Discrete Fourier Transform (DFT) and its properties, Fast Fourier Transform (FFT) implementation, High-pass/low-pass filtering in the frequency domain, Homomorphic filtering for illumination correction, Discrete Cosine Transform (DCT) for compression, Wavelet transforms (Haar, Daubechies), Multi-resolution analysis and pyramid representations.
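
To make the 2D DFT concrete, the definition can be evaluated directly on a tiny image. The sketch below is a naive O(N^4) evaluation of the definition, not an FFT, and is only meant for checking small cases by hand; the function name `dft2` is a hypothetical helper, and the image is assumed to be a non-empty list of equal-length rows.

```python
import cmath


def dft2(image):
    """Naive 2D DFT: F[u][v] = sum over x, y of f[x][y] * exp(-2*pi*i*(u*x/M + v*y/N))."""
    rows, cols = len(image), len(image[0])
    spectrum = [[0j] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            acc = 0j
            for x in range(rows):
                for y in range(cols):
                    angle = -2 * cmath.pi * (u * x / rows + v * y / cols)
                    acc += image[x][y] * cmath.exp(1j * angle)
            spectrum[u][v] = acc
    return spectrum


if __name__ == "__main__":
    tiny = [[1, 2], [3, 4]]
    for row in dft2(tiny):
        print([complex(round(c.real, 6), round(c.imag, 6)) for c in row])
```

For the 2x2 example, the DC coefficient F[0][0] equals the sum of all pixel values (10), which is one of the basic DFT properties listed in this unit.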
Corner detection (Harris, Shi-Tomasi), Blob detection (LoG, DoG, Hessian), SIFT (scale-space extrema, keypoint description), SURF and ORB for real-time applications, HOG (Histogram of Oriented Gradients) for pedestrian detection, Feature matching (nearest neighbor, FLANN), RANSAC for robust estimation, Bag-of-visual-words model.
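
RANSAC is easiest to see on a toy problem. The following is a minimal sketch of RANSAC for fitting a 2D line to points contaminated by outliers; the function name `ransac_line`, the default iteration count, and the inlier threshold are illustrative assumptions, and the final least-squares refit on the inliers that a fuller implementation would usually perform is omitted.

```python
import math
import random


def ransac_line(points, iterations=200, threshold=0.1, seed=0):
    """Fit a line a*x + b*y + c = 0 (with a^2 + b^2 = 1) by RANSAC.

    Returns the best line and the list of points within `threshold` of it.
    """
    rng = random.Random(seed)
    best_line, best_inliers = None, []
    for _ in range(iterations):
        # 1. Draw a minimal sample: two points define a candidate line.
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        a, b = y2 - y1, x1 - x2
        norm = math.hypot(a, b)
        if norm == 0:
            continue  # coincident points, no line defined
        a, b = a / norm, b / norm
        c = -(a * x1 + b * y1)

        # 2. Score the candidate by counting points close to the line.
        inliers = [p for p in points if abs(a * p[0] + b * p[1] + c) <= threshold]

        # 3. Keep the candidate with the largest consensus set.
        if len(inliers) > len(best_inliers):
            best_line, best_inliers = (a, b, c), inliers
    return best_line, best_inliers


if __name__ == "__main__":
    on_line = [(x, 2 * x + 1) for x in range(10)]   # points on y = 2x + 1
    outliers = [(1, 9), (4, -3), (7, 2)]            # gross outliers
    line, inliers = ransac_line(on_line + outliers)
    a, b, c = line
    print("slope:", -a / b, "intercept:", -c / b, "inliers:", len(inliers))
```

The same sample, score, and keep-best loop carries over to the models used in feature matching, such as homographies or fundamental matrices, with a larger minimal sample and a different residual.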
Pinhole camera model and intrinsic/extrinsic parameters, Camera calibration (Zhang's method, checkerboard patterns), Epipolar geometry and fundamental matrix, Stereo vision (disparity maps, rectification), Structure from Motion (SfM) pipeline, SLAM fundamentals (visual odometry, bundle adjustment), Depth estimation from monocular cues.
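
The pinhole model with intrinsic and extrinsic parameters reduces to a short computation: transform the world point into the camera frame, divide by depth, then apply the focal lengths and principal point. The sketch below assumes an ideal pinhole camera with no lens distortion and no skew; the function name `project_point` and the parameter names `fx`, `fy`, `cx`, `cy` are hypothetical, chosen to follow common notation.

```python
def project_point(point_world, rotation, translation, fx, fy, cx, cy):
    """Project a 3D world point to pixel coordinates with a pinhole camera.

    rotation is a 3x3 matrix (list of rows) and translation a 3-vector,
    together mapping world coordinates to camera coordinates (extrinsics).
    fx, fy, cx, cy are the focal lengths and principal point (intrinsics).
    """
    # Extrinsics: X_cam = R * X_world + t
    xc, yc, zc = (
        sum(rotation[i][j] * point_world[j] for j in range(3)) + translation[i]
        for i in range(3)
    )
    if zc <= 0:
        raise ValueError("point is behind the camera or on the image plane normal")

    # Perspective division followed by the intrinsic mapping to pixels.
    u = fx * xc / zc + cx
    v = fy * yc / zc + cy
    return u, v


if __name__ == "__main__":
    identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    print(project_point((1.0, 2.0, 4.0), identity, (0.0, 0.0, 0.0),
                        fx=800, fy=800, cx=320, cy=240))
```

Camera calibration estimates exactly these quantities (the intrinsics, plus a rotation and translation per view) from known 3D-to-2D correspondences such as checkerboard corners.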
CNN architectures for vision (AlexNet, VGG, ResNet, EfficientNet), Object detection (R-CNN family, YOLO, SSD, RetinaNet), Semantic segmentation (FCN, U-Net, DeepLab), Instance segmentation (Mask R-CNN), Vision transformers (ViT, Swin Transformer), Self-supervised learning (SimCLR, DINO), 3D vision with PointNet/PointNet++, Video analysis (optical flow, 3D CNNs).