Jason Rambach, Chengbiao Deng, Alain Pagani, and Didier Stricker.
Learning 6DoF object poses from synthetic single channel images.
In Adjunct Proceedings of the IEEE International Symposium for Mixed and Augmented Reality 2018 (To appear). 2018.
Estimating 6DoF object poses from single images is a problem of great interest in augmented reality and robotics research, since it enables interaction with the object or initialization of pose tracking. Approaches based on deep neural networks have shown good performance; however, most of them rely on training on real images of the objects, which can be challenging in terms of ground-truth pose acquisition, scalability, and full coverage of possible poses. In this paper, we disregard all depth and color information and train a CNN to directly regress 6DoF object poses using only synthetic, single-channel, edge-enhanced images. We compare our approach against state-of-the-art methods trained on synthetic images and show a significant improvement on the commonly used LINEMOD benchmark dataset.
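As an illustration of what a single-channel, edge-enhanced input could look like, the following is a minimal sketch and not the authors' pipeline. The abstract does not name the edge operator, so the Sobel filter, the blending weight `alpha`, and the function name `to_edge_enhanced` are all assumptions made for this example.

```python
import numpy as np


def to_edge_enhanced(rgb: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Hypothetical helper: map an HxWx3 uint8 RGB image to an HxW float32
    single-channel image in [0, 1] with emphasized edges.

    Assumption: Sobel gradients blended with the grayscale image; the paper's
    actual edge enhancement may differ.
    """
    # Discard color: standard luma weights, scaled to [0, 1].
    weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)
    gray = rgb.astype(np.float32) @ weights / 255.0

    # Sobel gradients computed with shifted views of an edge-padded image.
    p = np.pad(gray, 1, mode="edge")
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) \
        - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2])
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) \
        - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:])

    # Gradient magnitude, normalized to [0, 1] when any edge is present.
    mag = np.hypot(gx, gy)
    peak = mag.max()
    if peak > 0:
        mag = mag / peak

    # Blend intensity and edge strength into one channel.
    out = (1.0 - alpha) * gray + alpha * mag
    return np.clip(out, 0.0, 1.0).astype(np.float32)


# Usage: a synthetic 8x8 image with a vertical step edge in the middle.
img = np.zeros((8, 8, 3), dtype=np.uint8)
img[:, 4:, :] = 255
edges = to_edge_enhanced(img)
```

In such a setup the same transform would be applied to both the rendered training images and the real test images, so that the network sees inputs of the same kind in both cases.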