DESCRIPTION
Our dream of making machines sense and perceive (notably see) is coming true. Computer Vision now enables diverse applications:
- Autonomous Systems (cars, drones, vessels) Perception,
- Robotics Perception and Control,
- Intelligent Human-Machine Interaction,
- Anthropocentric (human-centered) Computing,
- Smart Cities/Buildings and Assisted Living.
Advances in Computer Vision, coupled with AI (notably Machine Learning and Deep Neural Networks), hit the news almost every day.
This CVML Web Module focuses on Computer Vision, its applications in the domains listed above and the new challenges ahead. It starts with an introduction to computer vision, complemented by a formal presentation of digital images and videos, image/video sampling and color theory. Image acquisition is reviewed first, followed by camera structure, camera geometry (mapping the 3D world onto a 2D image plane) and camera calibration. Stereo and multi-view imaging systems are then presented for recovering 3D world geometry from 2D images. This is complemented by Structure from Motion (SfM), leading to Simultaneous Localization and Mapping (SLAM) for vehicle and/or target localization. Semantic 3D world mapping is then overviewed, coupling 3D geometry and semantics. 3D object/target localization follows, encompassing visual 3D object localization using 3D maps, GPS object localization, multisensor object localization and multi-view object localization. Object pose is then defined and its estimation by deep neural regression is presented. Computational Cinematography, a new topic in computer vision, covers visual shot framing types and shot feasibility issues under shooting constraints. Monocular depth estimation is also addressed.
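As a rough illustration of the camera geometry topic mentioned above (mapping the 3D world onto a 2D image plane), the following minimal sketch projects 3D points to pixel coordinates with an ideal pinhole camera model. It is not taken from the module material: the function name `project_points`, the intrinsic matrix values (focal length 800 px, principal point at (320, 240)) and the identity camera pose are all assumptions chosen for illustration, and lens distortion is ignored.

```python
import numpy as np

def project_points(points_world, K, R, t):
    """Project Nx3 world points to Nx2 pixel coordinates (ideal pinhole model)."""
    points_world = np.asarray(points_world, dtype=float)
    # World -> camera coordinates: X_c = R @ X_w + t (applied row-wise)
    points_cam = points_world @ R.T + t
    # Apply the intrinsic matrix, then divide by depth (perspective division)
    homogeneous = points_cam @ K.T
    return homogeneous[:, :2] / homogeneous[:, 2:3]

# Hypothetical intrinsics: focal length 800 px, principal point (320, 240)
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)    # assumed: camera axes aligned with world axes
t = np.zeros(3)  # assumed: camera centre at the world origin

# Two points, both 2 units in front of the camera
pixels = project_points([[0.0, 0.0, 2.0], [0.5, -0.25, 2.0]], K, R, t)
print(pixels)  # the point on the optical axis lands on the principal point
```

Camera calibration, in this simplified picture, amounts to estimating `K` (and, for a moving camera, `R` and `t`) from images of known scene points.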
LECTURE LIST
- 3D Object Localization
- Computational Cinematography
- Digital Images and Videos
- Image Acquisition. Camera Geometry
- Introduction to Computer Vision
- Neural Semantic 3D World Modeling and Mapping
- Object Pose Estimation
- Semantic 3D World Mapping
- Simultaneous Localization and Mapping
- Stereo and Multiview Imaging
- Structure from Motion
- Attention and Transformer Networks in Computer Vision