Towards Robust 6D Pose Estimation: Investigating Synthetic Data, Challenging Materials and Complex Robotic Environments / Govi, E. - (2025).

Towards Robust 6D Pose Estimation: Investigating Synthetic Data, Challenging Materials and Complex Robotic Environments

GOVI, ELENA
2025-01-01

Abstract

Humans effortlessly perceive the 3D world through their visual system, a remarkable ability that allows them to recognize objects, people, and emotions and to perceive space accurately. Explaining and reproducing this ability, however, remains a challenging area of study for psychologists and computer vision researchers, who strive to model how we interpret 3D scenes from 2D images. The task is difficult because it is an ill-posed inverse problem: the unknowns must be recovered from incomplete information. Researchers address it with probabilistic models, machine learning, and physics-based approaches that aim to replicate human vision. However, current artificial systems still lag behind human vision, particularly in their ability to generalize across different settings and tasks. This thesis focuses on one computer vision task, 6D pose estimation of rigid objects, and tests how well methods generalize to different scenarios and applications. Although considerable progress has been made in this field, several formidable challenges remain, including domain shift, occlusions, symmetries, novel objects, and reflective or transparent materials. A central challenge is making this research work effectively in real-world robotic applications: robots must operate in challenging environments, and the datasets used in the literature often do not reflect the complexity of the real world. This thesis develops robust methodologies that perform effectively across different scenarios and real-world applications. The focus is on using synthetic datasets and designing pipelines tailored to the unique challenges posed by specific settings and objects. The ultimate goal is to bring reliable, adaptable computer vision systems closer to real-world robotic applications, enhancing robotic perception.
2025
Mathematics
Artificial Intelligence
Robotic Vision
6D pose estimation
Deep Learning
Bertogna, Marko
Franchini, Giorgia
Files in this item:
File: EGOVI_Doctoral_Thesis_corrected_pdfa.pdf
Embargo until 01/04/2027
License: Creative Commons
Size: 55.62 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/1889/6136