Computer Vision Group
TUM Department of Informatics
Technical University of Munich

Deep Learning

Contact: Dr. Laura Leal-Taixe, Vladimir Golkov, Tim Meinhardt, Qunjie Zhou, Patrick Dendorfer

Deep Learning is a powerful machine learning tool that has shown outstanding performance in many fields. One of its greatest successes has been large-scale object recognition with Convolutional Neural Networks (CNNs). The main strength of CNNs is that they learn representations directly from the data in a hierarchical, layer-based structure.

We apply Convolutional Neural Networks to computer vision tasks such as optical flow estimation and scene understanding, and develop state-of-the-art methods for them.

Learning by Association

A child is able to learn new concepts quickly, without needing millions of examples that are pointed out individually. Once a child has seen one dog, she or he will be able to recognize other dogs and will become better at recognition with subsequent exposure to more variety. In terms of training computers to perform similar tasks, deep neural networks have demonstrated superior performance among machine learning models.

However, these networks are trained very differently from the way a child learns: they follow a purely supervised training scheme that requires a label for every training example. Neural networks are defined by a huge number of parameters to be optimized. Therefore, a large amount of labeled training data is required, which can be costly and time-consuming to obtain. It is desirable to train machine learning models without labels (unsupervised) or with only a fraction of the data labeled (semi-supervised).

We propose a novel training method that follows an intuitive approach: learning by association. We feed a batch of labeled and a batch of unlabeled data through a network, producing embeddings for both batches. Then, an imaginary walker is sent from samples in the labeled batch to samples in the unlabeled batch. The transition follows a probability distribution obtained from the similarity of the respective embeddings, which we refer to as an association.
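The walker step described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the published implementation: the function name `association_probabilities`, the use of the dot product as the similarity, and the softmax normalization are assumptions made for this example, and the embeddings are random stand-ins for network outputs.

```python
import numpy as np

def association_probabilities(labeled_emb, unlabeled_emb):
    """Transition probabilities of a walker from labeled to unlabeled samples."""
    # Assumption: similarity is the dot product between embeddings.
    similarity = labeled_emb @ unlabeled_emb.T  # shape (n_labeled, n_unlabeled)
    # Assumption: a softmax over the unlabeled samples turns similarities into
    # probabilities; the row maximum is subtracted for numerical stability.
    shifted = similarity - similarity.max(axis=1, keepdims=True)
    weights = np.exp(shifted)
    return weights / weights.sum(axis=1, keepdims=True)

# Random stand-ins for the embeddings of a labeled and an unlabeled batch.
rng = np.random.default_rng(0)
labeled = rng.normal(size=(4, 8))    # 4 labeled samples, 8-dim embeddings
unlabeled = rng.normal(size=(6, 8))  # 6 unlabeled samples
probs = association_probabilities(labeled, unlabeled)
```

Each row of `probs` is the distribution over unlabeled samples that the walker follows when it starts from one labeled sample.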

In this line of work, we have published papers on semi-supervised training, domain adaptation, multimodal training (text and images) and unsupervised training / clustering. The corresponding papers appear in the publication list below.

Deep Depth From Focus

Deep Depth From Focus (DDFF) aims to predict a depth map from a focal stack, a sequence of images in which the focus of the camera changes gradually. DDFFNet is a Convolutional Neural Network, trained end-to-end, that is designed to solve the highly ill-posed depth from focus task. Please visit the DDFF Project Page for details.
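To make the input and output of the task concrete, the sketch below implements a classical depth-from-focus baseline, not DDFFNet: for every pixel it picks the slice of the focal stack with the strongest response of a simple sharpness measure. The function name `depth_from_focus`, the squared Laplacian as sharpness measure, and the toy focus distances are assumptions made for this example.

```python
import numpy as np

def depth_from_focus(focal_stack, focus_distances):
    """Per-pixel depth as the focus distance of the sharpest slice.

    focal_stack: array of shape (S, H, W) with S grayscale slices.
    focus_distances: sequence of S focus distances, one per slice.
    """
    # Pad each slice by one pixel so the Laplacian is defined at the borders.
    padded = np.pad(focal_stack, ((0, 0), (1, 1), (1, 1)), mode="edge")
    # 4-neighbour discrete Laplacian, used here as an assumed sharpness measure.
    laplacian = (
        padded[:, :-2, 1:-1] + padded[:, 2:, 1:-1]
        + padded[:, 1:-1, :-2] + padded[:, 1:-1, 2:]
        - 4.0 * padded[:, 1:-1, 1:-1]
    )
    # Index of the sharpest slice for every pixel, shape (H, W).
    sharpest = np.argmax(laplacian ** 2, axis=0)
    return np.asarray(focus_distances)[sharpest]

# Toy focal stack: only the middle slice contains texture (a checkerboard).
checkerboard = (np.indices((8, 8)).sum(axis=0) % 2).astype(float)
stack = np.stack([np.full((8, 8), 0.5), checkerboard, np.full((8, 8), 0.5)])
depth = depth_from_focus(stack, focus_distances=[0.5, 1.0, 2.0])
```

Such a hand-crafted measure is unreliable in textureless regions, where no slice gives a clear sharpness peak, which is one reason the task is ill-posed.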


In our ICCV'15 paper on FlowNet, we presented two CNN architectures that estimate the optical flow from a single image pair. The networks are trained end-to-end on a GPU. Our system performs on par with state-of-the-art techniques.

Unsupervised Domain Adaptation for Vehicle Control

Even though end-to-end supervised learning has shown promising results for sensorimotor control of self-driving cars, its performance depends strongly on the weather conditions seen during training, and it generalizes poorly to unseen conditions. We therefore show how knowledge can be transferred to new weather conditions using semantic maps, without the need to obtain new ground truth data. To this end, we propose to divide the task of vehicle control into two independent modules: a control module, which is trained on only one weather condition for which labeled steering data is available, and a perception module, which serves as an interface between new weather conditions and the fixed control module. To generate the semantic data needed to train the perception module, we propose a model based on generative adversarial networks (GANs) that retrieves the semantic information for the new conditions in an unsupervised manner. We introduce a master-servant architecture, in which the master model (semantic labels available) trains the servant model (semantic labels not available).

2020
Deep Learning for Virtual Screening: Five Reasons to Use ROC Cost Functions (V. Golkov, A. Becker, D. T. Plop, D. Čuturilo, N. Davoudi, J. Mendenhall, R. Moretti, J. Meiler and D. Cremers), In arXiv preprint arXiv:2007.07029, 2020.
Conference and Workshop Papers
D3VO: Deep Depth, Deep Pose and Deep Uncertainty for Monocular Visual Odometry (N. Yang, L. von Stumberg, R. Wang and D. Cremers), In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020. [arXiv:2003.01060] Oral Presentation.
2019
Journal Articles
Efficient Deep Network Architectures for Fast Chest X-Ray Tuberculosis Screening and Visualization (F. Pasa, V. Golkov, F. Pfeiffer, D. Cremers and D. Pfeiffer), In Scientific Reports, volume 9, 2019.
Deep Learning for 2D and 3D Rotatable Data: An Overview of Methods (L. Della Libera, V. Golkov, Y. Zhu, A. Mielke and D. Cremers), In arXiv preprint arXiv:1910.14594, 2019.
Learning to Evolve (J. Schuchardt, V. Golkov and D. Cremers), In arXiv preprint arXiv:1905.03389, 2019.
Conference and Workshop Papers
Negative-Unlabeled Learning for Diffusion MRI (P. Swazinna, V. Golkov, I. Lipp, E. Sgarlata, V. Tomassini, D. K. Jones and D. Cremers), In International Society for Magnetic Resonance in Medicine (ISMRM) Annual Meeting, 2019.
q-Space Novelty Detection with Variational Autoencoders (A. Vasilev, V. Golkov, M. Meissner, I. Lipp, E. Sgarlata, V. Tomassini, D. K. Jones and D. Cremers), In MICCAI 2019 International Workshop on Computational Diffusion MRI, 2019. [arXiv:1806.02997] Oral Presentation.
2018
Journal Articles
What Makes Good Synthetic Training Data for Learning Disparity and Optical Flow Estimation? (Nikolaus Mayer, Eddy Ilg, Philipp Fischer, Caner Hazirbas, Daniel Cremers, Alexey Dosovitskiy and Thomas Brox), In , volume 41, 2018. [arXiv:1801.06397]
Clustering with Deep Learning: Taxonomy and New Methods (E. Aljalbout, V. Golkov, Y. Siddiqui, M. Strobel and D. Cremers), In arXiv preprint arXiv:1801.07648, 2018.
Conference and Workshop Papers
Deep Virtual Stereo Odometry: Leveraging Deep Depth Prediction for Monocular Direct Sparse Odometry (N. Yang, R. Wang, J. Stueckler and D. Cremers), In European Conference on Computer Vision (ECCV), 2018. Oral Presentation.
Associative Deep Clustering - Training a Classification Network with no Labels (P. Haeusser, J. Plapp, V. Golkov, E. Aljalbout and D. Cremers), In Proc. of the German Conference on Pattern Recognition (GCPR), 2018.
Precursor microRNA Identification Using Deep Convolutional Neural Networks (B. T. Do, V. Golkov, G. E. Gürel and D. Cremers), In bioRxiv preprint, 2018. (bioRxiv:414656)
q-Space Deep Learning for Alzheimer's Disease Diagnosis: Global Prediction and Weakly-Supervised Localization (V. Golkov, P. Swazinna, M. M. Schmitt, Q. A. Khan, C. M. W. Tax, M. Serahlazau, F. Pasa, F. Pfeiffer, G. J. Biessels, A. Leemans and D. Cremers), In International Society for Magnetic Resonance in Medicine (ISMRM) Annual Meeting, 2018.
Deep Depth From Focus (C. Hazirbas, S. G. Soyer, M. C. Staab, L. Leal-Taixé and D. Cremers), In Asian Conference on Computer Vision (ACCV), 2018.
2017
Regularization for Deep Learning: A Taxonomy (J. Kukačka, V. Golkov and D. Cremers), In arXiv preprint arXiv:1710.10686, 2017.
3D Deep Learning for Biological Function Prediction from Physical Fields (V. Golkov, M. J. Skwark, A. Mirchev, G. Dikov, A. R. Geanes, J. Mendenhall, J. Meiler and D. Cremers), In arXiv preprint arXiv:1704.04039, 2017.
Conference and Workshop Papers
Associative Domain Adaptation (P. Haeusser, T. Frerix, A. Mordvintsev and D. Cremers), In IEEE International Conference on Computer Vision (ICCV), 2017.
Better Text Understanding Through Image-To-Text Transfer (K. Kurach, S. Gelly, M. Jastrzebski, P. Haeusser, O. Teytaud, D. Vincent and O. Bousquet), In arXiv preprint arXiv:1705.08386, 2017.
One-Shot Video Object Segmentation (S. Caelles, K.-K. Maninis, J. Pont-Tuset, L. Leal-Taixé, D. Cremers and L. Van Gool), In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
Learning Proximal Operators: Using Denoising Networks for Regularizing Inverse Imaging Problems (T. Meinhardt, M. Moeller, C. Hazirbas and D. Cremers), In IEEE International Conference on Computer Vision (ICCV), 2017.
Learning by Association - A versatile semi-supervised training method for neural networks (P. Haeusser, A. Mordvintsev and D. Cremers), In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
Establishment of an interdisciplinary workflow of machine learning-based Radiomics in sarcoma patients (J.C. Peeken, C. Knie, V. Golkov, K. Kessel, F. Pasa, Q. Khan, M. Seroglazov, J. Kukačka, T. Goldberg, L. Richter, J. Reeb, B. Rost, F. Pfeiffer, D. Cremers, F. Nüsslin and S.E. Combs), In 23. Jahrestagung der Deutschen Gesellschaft für Radioonkologie (DEGRO), 2017.
Image-based localization using LSTMs for structured feature correlation (F. Walch, C. Hazirbas, L. Leal-Taixé, T. Sattler, S. Hilsenbeck and D. Cremers), In IEEE International Conference on Computer Vision (ICCV), 2017.
2016
Journal Articles
q-Space Deep Learning: Twelve-Fold Shorter and Model-Free Diffusion MRI Scans (V. Golkov, A. Dosovitskiy, J. I. Sperl, M. I. Menzel, M. Czisch, P. Sämann, T. Brox and D. Cremers), In IEEE Transactions on Medical Imaging, volume 35, 2016. Special Issue on Deep Learning.
Conference and Workshop Papers
FuseNet: Incorporating Depth into Semantic Segmentation via Fusion-based CNN Architecture (C. Hazirbas, L. Ma, C. Domokos and D. Cremers), In Asian Conference on Computer Vision, 2016.
Protein Contact Prediction from Amino Acid Co-Evolution Using Convolutional Networks for Graph-Valued Images (V. Golkov, M. J. Skwark, A. Golkov, A. Dosovitskiy, T. Brox, J. Meiler and D. Cremers), In Annual Conference on Neural Information Processing Systems (NIPS), 2016. Oral Presentation (acceptance rate: under 2%).
2015
Conference and Workshop Papers
CAPTCHA Recognition with Active Deep Learning (F. Stark, C. Hazirbas, R. Triebel and D. Cremers), In GCPR Workshop on New Challenges in Neural Computation, 2015.
FlowNet: Learning Optical Flow with Convolutional Networks (A. Dosovitskiy, P. Fischer, E. Ilg, P. Haeusser, C. Hazirbas, V. Golkov, P. van der Smagt, D. Cremers and T. Brox), In IEEE International Conference on Computer Vision (ICCV), 2015.
q-Space Deep Learning for Twelve-Fold Shorter and Model-Free Diffusion MRI Scans (V. Golkov, A. Dosovitskiy, P. Sämann, J. I. Sperl, T. Sprenger, M. Czisch, M. I. Menzel, P. A. Gómez, A. Haase, T. Brox and D. Cremers), In Medical Image Computing and Computer Assisted Intervention (MICCAI), 2015.

Informatik IX
Chair of Computer Vision & Artificial Intelligence

Boltzmannstrasse 3
85748 Garching