Towards Robust 6D Pose Estimation: Investigating Synthetic Data, Challenging Materials and Complex Robotic Environments / Govi, E. - (2025).
Towards Robust 6D Pose Estimation: Investigating Synthetic Data, Challenging Materials and Complex Robotic Environments
GOVI, ELENA
2025-01-01
Abstract
Humans effortlessly perceive the 3D world through their visual system, a remarkable ability that allows them to recognize objects, people, and emotions and to perceive space accurately. Modeling this ability, however, remains a challenging area of study for psychologists and computer vision researchers, who strive to explain how 3D scenes are interpreted from 2D images. The difficulty lies in the nature of the task as an ill-posed inverse problem, in which unknowns must be recovered from incomplete information. Researchers address it with probabilistic models, machine learning, and physics-based approaches that aim to replicate human vision. Nevertheless, current artificial systems still lag behind, particularly in their ability to generalize across different settings and tasks. This thesis focuses on one computer vision task, 6D pose estimation of rigid objects, and tests how well it generalizes to different scenarios and applications. Although considerable progress has been made in this field, several formidable challenges remain, including domain shift, occlusions, symmetries, novel objects, and reflective or transparent materials. One main challenge is making this research work effectively in real-world robotic applications: robots must operate in challenging environments, and datasets from the literature often do not reflect the complexity of the real world. This thesis develops robust methodologies that perform effectively across different scenarios and real-world applications. The focus is on using synthetic datasets and on designing pipelines tailored to the unique challenges posed by specific settings and objects. The ultimate goal is to bring reliable, adaptable computer vision systems closer to real-world robotic applications, enhancing robotic perception.

| File | Size | Format |
|---|---|---|
| EGOVI_Doctoral_Thesis_corrected_pdfa.pdf (embargoed until 01/04/2027; license: Creative Commons) | 55.62 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


