Course Description
This course provides a comprehensive introduction to the principles and applications of computer vision, covering both classical techniques and modern deep learning-based approaches. Core topics include image formation, geometric transformations, and feature detection. Students then progress to advanced topics such as Structure-from-Motion (SfM) and neural networks, and to modern applications including object detection, semantic segmentation, and computational photography. A final project will allow students to apply their knowledge to real-world problems.
Prerequisite(s)
A solid understanding of probability, linear algebra, data structures, and algorithms is expected. Prior experience in image processing is helpful but not required, as foundational concepts will be introduced.
Textbook
Additional References:
- Digital Image Processing by Rafael Gonzalez and Richard Woods
- Computer Vision: A Modern Approach by David Forsyth and Jean Ponce
- Generative Deep Learning by David Foster
- Multiple View Geometry in Computer Vision by Richard Hartley and Andrew Zisserman
Course Objectives
- To explore the foundational concepts and techniques of computer vision, such as image formation, geometric transformations, and photometric properties.
- To study classical techniques like feature detection, image alignment, and 3D reconstruction.
- To introduce modern deep learning-based methods, including object detection, semantic segmentation, and depth estimation.
- To provide practical experience with tools like OpenCV and PyTorch, bridging theory and application.
- To develop critical analysis skills for applying computer vision techniques in diverse fields like robotics, AR/VR, and medical imaging.
Course Outcomes
Upon successful completion of this course, students will be able to:
- Describe and analyze the mathematical and algorithmic foundations of computer vision, including image transformations and multi-view geometry.
- Implement classical and modern techniques for image processing, feature extraction, and motion analysis.
- Apply advanced deep learning models to solve challenges such as object detection, segmentation, and depth estimation.
- Design and evaluate computer vision systems for practical applications in fields like robotics and medical imaging.
- Critically assess the limitations and performance of computer vision algorithms and models.
Course Outline
- Introduction to Computer Vision: Overview, applications, and challenges.
- Image Formation and Geometric Transformations: 2D/3D transformations, projections, and the digital camera.
- Image Processing Fundamentals: Pixel transforms, histogram equalization, and filtering.
- Feature Detection and Matching: Harris corners, SIFT, ORB, and feature tracking.
- Model Fitting and Optimization: Robust fitting, RANSAC, and Markov random fields.
- Image Alignment and Stitching: Pairwise alignment, motion models, and stitching.
- Deep Learning for Vision: Neural networks, CNNs, and vision transformers.
- Motion Estimation: Optical flow, video stabilization, and frame interpolation.
- 3D Reconstruction and SLAM: Structure-from-Motion (SfM), simultaneous localization and mapping (SLAM), and multi-view reconstruction.
- Recognition and Segmentation: Object detection, semantic segmentation, and instance segmentation.
- Advanced Topics: High dynamic range (HDR) imaging, neural radiance fields (NeRF), and generative adversarial networks (GANs).
Grade Distribution
- Quizzes: 20%
- Projects and/or Assignments: 30%
- Midterm Exam: 20%
- Final Exam: 30%