CS342: Computer Vision and Pattern Recognition

Course Description

This course provides a comprehensive introduction to the principles and applications of computer vision, covering both classical techniques and modern deep learning-based approaches. Core topics include image formation, geometric transformations, and feature detection. Advanced topics include Structure-from-Motion (SfM) and neural networks, along with applications such as object detection, semantic segmentation, and computational photography. In a final project, students apply these techniques to real-world problems.

Prerequisite(s)

Students should have a solid understanding of probability, linear algebra, data structures, and algorithms. Prior experience in image processing is helpful but not required; foundational concepts will be introduced in the course.

Textbook

Computer Vision: Algorithms and Applications (Second Edition) by Richard Szeliski

Additional References:

Digital Image Processing by Rafael Gonzalez and Richard Woods
Computer Vision: A Modern Approach by David Forsyth and Jean Ponce
Generative Deep Learning by David Foster
Multiple View Geometry in Computer Vision by Richard Hartley and Andrew Zisserman

Course Objectives

To explore the essential concepts and foundational techniques of computer vision, such as image formation, geometric transformations, and photometric properties.
To study classical techniques like feature detection, image alignment, and 3D reconstruction.
To introduce modern deep learning-based methods, including object detection, semantic segmentation, and depth estimation.
To provide practical experience with tools like OpenCV and PyTorch, bridging theory and application.
To develop critical analysis skills for applying computer vision techniques in diverse fields like robotics, AR/VR, and medical imaging.

Course Outcomes

Describe and analyze the mathematical and algorithmic foundations of computer vision, including image transformations and multi-view geometry.
Implement classical and modern techniques for image processing, feature extraction, and motion analysis.
Apply advanced deep learning models to solve challenges such as object detection, segmentation, and depth estimation.
Design and evaluate computer vision systems for practical applications in fields like robotics and medical imaging.
Critically assess the limitations and performance of computer vision algorithms and models.

Course Outline

Introduction to Computer Vision: Overview, applications, and challenges.
Image Formation and Geometric Transformations: 2D/3D transformations, projections, and the digital camera.
Image Processing Fundamentals: Pixel transforms, histogram equalization, and filtering.
Feature Detection and Matching: Harris corners, SIFT, ORB, and feature tracking.
Model Fitting and Optimization: Robust fitting, RANSAC, and Markov random fields.
Image Alignment and Stitching: Pairwise alignment, motion models, and stitching.
Deep Learning for Vision: Neural networks, CNNs, and vision transformers.
Motion Estimation: Optical flow, video stabilization, and frame interpolation.
3D Reconstruction and SLAM: Structure-from-Motion (SfM), simultaneous localization and mapping (SLAM), and multi-view reconstruction.
Recognition and Segmentation: Object detection, semantic segmentation, and instance segmentation.
Advanced Topics: High dynamic range (HDR) imaging, neural radiance fields (NeRF), and generative adversarial networks (GANs).

Grade Distribution

Quizzes: 20%
Projects and/or Assignments: 30%
Midterm Exam: 20%
Final Exam: 30%