Computer Vision Group
TUM Department of Informatics
Technical University of Munich


Deep Virtual Stereo Odometry: Leveraging Deep Depth Prediction for Monocular Direct Sparse Odometry

Contact: Nan Yang, Rui Wang, Jörg Stückler, Prof. Daniel Cremers

ECCV 2018 oral presentation

Featured in Computer Vision News' Best of ECCV.


Abstract

Monocular visual odometry approaches that rely purely on geometric cues are prone to scale drift and require sufficient motion parallax in successive frames for motion estimation and 3D reconstruction. In this paper, we propose to leverage deep monocular depth prediction to overcome these limitations of geometry-based monocular visual odometry. To this end, we incorporate deep depth predictions into DSO as direct virtual stereo measurements. For depth prediction, we design a novel deep network that refines the depth predicted from a single image in a two-stage process. We train our network in a semi-supervised way on photoconsistency in stereo images and on consistency with accurate sparse depth reconstructions from Stereo DSO. Our deep predictions outperform state-of-the-art approaches for monocular depth on the KITTI benchmark. Moreover, our Deep Virtual Stereo Odometry clearly exceeds previous monocular and deep-learning-based methods in accuracy. It even achieves performance comparable to state-of-the-art stereo methods while relying on only a single camera.

Semi-Supervised Deep Monocular Depth Estimation

We propose a semi-supervised approach to deep monocular depth estimation. It builds on three key ingredients: self-supervised learning from photoconsistency in a stereo setup, supervised learning based on accurate sparse depth reconstruction by Stereo DSO, and StackNet, a two-stage network with a stacked encoder-decoder architecture.
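As a rough illustration of how the self-supervised and supervised ingredients can be combined, the following sketch computes a toy loss on a single rectified scanline. It is not the training code of the paper: the function names, the plain L1 penalties, the weights `w_photo` and `w_sup`, and the sign convention (a left pixel `u` with disparity `d` is looked up at `u - d` in the right image) are all assumptions made for this example.

```python
def sample_row(row, x):
    """Linearly interpolate a 1-D intensity row at an in-range position x."""
    x0 = int(x)
    x1 = min(x0 + 1, len(row) - 1)
    w = x - x0
    return (1.0 - w) * row[x0] + w * row[x1]


def photoconsistency_loss(left_row, right_row, left_disp_row):
    """Self-supervised term: reconstruct the left row from the right row.

    Assumed convention: I_left(u) should match I_right(u - d_left(u)).
    Pixels whose lookup falls outside the row are skipped.
    """
    errors = []
    for u, d in enumerate(left_disp_row):
        x = u - d
        if 0.0 <= x <= len(right_row) - 1:
            errors.append(abs(sample_row(right_row, x) - left_row[u]))
    return sum(errors) / len(errors) if errors else 0.0


def sparse_supervised_loss(pred_disp_row, sparse_disp):
    """Supervised term: L1 error at the few pixels that have a reference
    disparity (here a hypothetical dict: pixel index -> disparity, standing
    in for a sparse stereo reconstruction)."""
    if not sparse_disp:
        return 0.0
    total = sum(abs(pred_disp_row[u] - d) for u, d in sparse_disp.items())
    return total / len(sparse_disp)


def semi_supervised_loss(left_row, right_row, pred_disp_row, sparse_disp,
                         w_photo=1.0, w_sup=1.0):
    """Weighted sum of both terms; the weights are illustrative only."""
    return (w_photo * photoconsistency_loss(left_row, right_row, pred_disp_row)
            + w_sup * sparse_supervised_loss(pred_disp_row, sparse_disp))


# Toy scanline: the right row is the left row shifted by a disparity of 2.
left = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0]
right = [20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0]
print(photoconsistency_loss(left, right, [2.0] * 8))  # correct disparity
print(photoconsistency_loss(left, right, [1.0] * 8))  # wrong disparity
```

The photoconsistency term is dense but only as informative as the image texture allows, while the sparse term is accurate but available at few pixels; summing them lets each compensate for the other.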

Deep Virtual Stereo Odometry

Deep Virtual Stereo Odometry (DVSO) builds on the windowed sparse direct bundle adjustment formulation of monocular DSO. We use our disparity predictions in DSO in two key ways. First, we initialize the depth maps of new keyframes from the predicted disparities. Second, beyond this rather straightforward step, we incorporate virtual direct image alignment constraints into the windowed direct bundle adjustment of DSO. Assuming a virtual stereo setup, we obtain these constraints by warping images using the depth estimated by bundle adjustment together with the right disparities predicted by our network.
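The two uses of the predicted disparities can be sketched on a single rectified scanline. This is a minimal illustration under stated assumptions, not the DVSO implementation: the function names, the 1-D setting, the parameters `focal_px` and `baseline_m`, and the sign convention (a left pixel `u` maps to `u - f*b/depth` in the virtual right view, and the virtual right image is synthesized by looking up the left image at `u_right + d_right`) are all hypothetical choices for this example.

```python
def sample_row(row, x):
    """Linearly interpolate a 1-D row at position x, clamped to the row."""
    x = min(max(x, 0.0), len(row) - 1.0)
    x0 = int(x)
    x1 = min(x0 + 1, len(row) - 1)
    w = x - x0
    return (1.0 - w) * row[x0] + w * row[x1]


def disparity_to_depth(disparity, focal_px, baseline_m):
    """Depth from disparity for a rectified pair: depth = f * b / d.

    This is the kind of conversion one could use to initialize the depth
    of a new keyframe point from a predicted disparity.
    """
    if disparity <= 0.0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity


def virtual_stereo_residual(left_row, right_disp_row, u, inv_depth,
                            focal_px, baseline_m):
    """Photometric residual between a left pixel and its virtual right view.

    inv_depth is the current estimate being optimized (for example by
    bundle adjustment); right_disp_row holds predicted right disparities.
    """
    # Project the left pixel into the virtual right camera.
    u_right = u - focal_px * baseline_m * inv_depth
    # Synthesize the virtual right intensity by warping the left row
    # with the predicted right disparity at that location.
    d_right = sample_row(right_disp_row, u_right)
    virtual_right = sample_row(left_row, u_right + d_right)
    return virtual_right - sample_row(left_row, u)


left = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0]
right_disp = [2.0] * 8
depth = disparity_to_depth(2.0, focal_px=100.0, baseline_m=0.5)
print(depth)
print(virtual_stereo_residual(left, right_disp, 5, 1.0 / depth, 100.0, 0.5))
print(virtual_stereo_residual(left, right_disp, 5, 0.02, 100.0, 0.5))
```

In this toy setting the residual vanishes when the estimated inverse depth agrees with the predicted disparity and grows otherwise, which is how such a term can pull a monocular estimate toward a metrically consistent scale.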

Results

We quantitatively compare our StackNet against other state-of-the-art monocular depth prediction methods on the publicly available KITTI dataset. For DVSO, we evaluate its tracking accuracy on the KITTI odometry benchmark against other state-of-the-art monocular and stereo visual odometry systems. In the supplementary material, we also show the generalization ability of both StackNet and DVSO.

Monocular Depth Estimation

Monocular Visual Odometry

Downloads

Trajectories of DVSO on KITTI 00-10: dvso_kitti_00_10.zip

Publications



Journal Articles
2018
Direct Sparse Odometry (J. Engel, V. Koltun and D. Cremers), In IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018.
Conference and Workshop Papers
2021
MonoRec: Semi-Supervised Dense Reconstruction in Dynamic Environments from a Single Moving Camera (F. Wimbauer, N. Yang, L. von Stumberg, N. Zeller and D. Cremers), In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021. [arXiv:2011.11814]
2020
D3VO: Deep Depth, Deep Pose and Deep Uncertainty for Monocular Visual Odometry (N. Yang, L. von Stumberg, R. Wang and D. Cremers), In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020. [arXiv:2003.01060] Oral presentation.
2018
Deep Virtual Stereo Odometry: Leveraging Deep Depth Prediction for Monocular Direct Sparse Odometry (N. Yang, R. Wang, J. Stueckler and D. Cremers), In European Conference on Computer Vision (ECCV), 2018. Oral presentation.
2017
Stereo DSO: Large-Scale Direct Sparse Visual Odometry with Stereo Cameras (R. Wang, M. Schwörer and D. Cremers), In International Conference on Computer Vision (ICCV), 2017.