Computer Vision Group
TUM Department of Informatics
Technical University of Munich




DirectShape: Direct Photometric Alignment of Shape Priors

Contact: Rui Wang, Nan Yang, Jörg Stückler, Prof. Daniel Cremers

This page is still under construction. Stay tuned.


Scene understanding from images is a challenging problem encountered in autonomous driving. On the object level, while 2D methods have gradually evolved from computing simple bounding boxes to delivering finer-grained results such as instance segmentations, the 3D family is still dominated by estimating 3D bounding boxes. In this paper, we propose a novel approach to jointly infer the 3D rigid-body poses and shapes of vehicles from a stereo image pair using shape priors. Unlike previous works that geometrically align shapes to point clouds from dense stereo reconstruction, our approach works directly on images by combining a photometric and a silhouette alignment term in the energy function. An adaptive sparse point selection scheme is proposed to efficiently measure the consistency of both terms. In experiments, we show the superior performance of our method on 3D pose and shape estimation over the previous geometric approach. Moreover, we demonstrate that our method can also be applied as a refinement step, significantly boosting the performance of several state-of-the-art deep-learning-based 3D object detectors.
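At a high level, the energy formulation described above can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the function names, the PCA-style shape parameterization, and the weighting factor `lam` are all assumptions made for illustration; the actual terms, symbols, and point selection in the paper differ.

```python
import numpy as np

def shape_from_coeffs(mean_shape, basis, z):
    """Reconstruct a shape (e.g., sampled signed distances) from a
    low-dimensional shape prior: mean shape plus a linear basis
    weighted by coefficients z (a common PCA-style parameterization)."""
    return mean_shape + basis @ z

def photometric_term(I_left, I_right, pts_l, pts_r):
    """Sum of squared intensity differences between corresponding
    sparse points (x, y) in the left and right stereo images."""
    diff = I_left[pts_l[:, 1], pts_l[:, 0]] - I_right[pts_r[:, 1], pts_r[:, 0]]
    return np.sum(diff ** 2)

def silhouette_term(pred_mask, seg_mask, pts):
    """Penalize disagreement between the object silhouette predicted
    from the current pose/shape and a segmentation mask, evaluated
    at the same sparsely selected points."""
    diff = pred_mask[pts[:, 1], pts[:, 0]].astype(float) \
         - seg_mask[pts[:, 1], pts[:, 0]].astype(float)
    return np.sum(diff ** 2)

def total_energy(E_photo, E_sil, lam=1.0):
    """Combined energy: photometric term plus weighted silhouette term."""
    return E_photo + lam * E_sil
```

In an actual optimization, both terms would be minimized jointly over the 3D rigid-body pose and the shape coefficients, e.g. with Gauss-Newton using analytical Jacobians as derived in the supplementary document.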


ICRA Presentation

The video includes audio.


If you find our work useful in your research, please consider citing:

@inproceedings{wang2020directshape,
  author={R. Wang and N. Yang and J. Stueckler and D. Cremers},
  title={DirectShape: Photometric Alignment of Shape Priors for Visual Vehicle Pose and Shape Estimation},
  booktitle={Proc. of the IEEE International Conference on Robotics and Automation (ICRA)},
  year={2020}
}


  • ICRA 2020 paper: paper. Derivations of all the analytical Jacobians and additional qualitative results are provided in the supplementary document. Both are also available on arXiv.
  • Validation splits of KITTI Object 3D: val1.txt (used by Mono3D, 3DOP and MLF), val2.txt (used by Deep3DBox).
  • 3D pose evaluation results on KITTI Object 3D: tba
  • 3D shape evaluation results on KITTI Stereo 2015: tba
  • Please contact Rui Wang if you need anything further.


  • Shape variation by modifying shape coefficients with color coded signed distances to the surface:

  • Sample qualitative results (more can be found in the supplementary document above):



Conference and Workshop Papers
DirectShape: Photometric Alignment of Shape Priors for Visual Vehicle Pose and Shape Estimation (R. Wang, N. Yang, J. Stueckler and D. Cremers), In Proc. of the IEEE International Conference on Robotics and Automation (ICRA), 2020. ([video][presentation][project page][supplementary][arxiv]) [bibtex] [pdf]


Informatik IX
Chair of Computer Vision & Artificial Intelligence

Boltzmannstrasse 3
85748 Garching
info@vision.in.tum.de



