Deep Learning is a powerful machine learning tool that has shown outstanding performance in many fields. One of its greatest successes has been achieved in large-scale object recognition with Convolutional Neural Networks (CNNs). The main strength of CNNs comes from learning data representations directly from data in a hierarchical, layer-based structure.
We apply Convolutional Neural Networks to computer vision tasks such as optical flow estimation and scene understanding, and develop state-of-the-art methods for them.
Learning by Association
A child is able to learn new concepts quickly, without needing millions of examples that are pointed out individually. Once a child has seen one dog, she or he will be able to recognize other dogs and will become better at recognition with subsequent exposure to more variety. In terms of training computers to perform similar tasks, deep neural networks have demonstrated superior performance among machine learning models.
However, these networks have been trained dramatically differently from a learning child: they follow a purely supervised training scheme that requires a label for every training example. Neural networks are defined by a huge number of parameters to be optimized. Consequently, a large amount of labeled training data is required, which can be costly and time-consuming to obtain. It is therefore desirable to train machine learning models without labels (unsupervised) or with only a fraction of the data labeled (semi-supervised).
We propose a novel training method that follows an intuitive approach: learning by association. We feed a batch of labeled data and a batch of unlabeled data through a network, producing embeddings for both batches. Then, an imaginary walker is sent from samples in the labeled batch to samples in the unlabeled batch. Each transition follows a probability distribution obtained from the similarity of the respective embeddings, which we refer to as an association.
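The walker's transition step can be sketched in a few lines of NumPy. This is only a minimal illustration under stated assumptions, not the published implementation: the text above does not fix the similarity measure or the normalization, so the dot product and the softmax used here are assumptions, and the function name `association_probabilities`, the batch sizes and the random embeddings are hypothetical stand-ins for real network outputs.

```python
import numpy as np


def association_probabilities(emb_labeled, emb_unlabeled):
    """Transition probabilities of a walker going from labeled to unlabeled samples.

    emb_labeled:   array of shape (n_labeled, dim)
    emb_unlabeled: array of shape (n_unlabeled, dim)
    Returns an array of shape (n_labeled, n_unlabeled) whose rows sum to 1.
    """
    # Similarity of every labeled embedding to every unlabeled embedding.
    # Assumption: similarity is measured by the dot product.
    similarity = emb_labeled @ emb_unlabeled.T
    # Assumption: similarities become probabilities via a softmax over the
    # unlabeled samples; the row maximum is subtracted for numerical stability.
    similarity = similarity - similarity.max(axis=1, keepdims=True)
    exp_sim = np.exp(similarity)
    return exp_sim / exp_sim.sum(axis=1, keepdims=True)


# Hypothetical embeddings standing in for the network outputs of both batches.
rng = np.random.default_rng(0)
emb_labeled = rng.normal(size=(4, 8))
emb_unlabeled = rng.normal(size=(6, 8))

p = association_probabilities(emb_labeled, emb_unlabeled)

# One walker per labeled sample: draw the unlabeled sample it steps to.
walker_targets = np.array([rng.choice(p.shape[1], p=row) for row in p])
```

Each row of `p` is the distribution over unlabeled samples for one labeled sample, so a labeled sample is most likely to be associated with the unlabeled samples whose embeddings are most similar to its own.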
In this line of work, we have published papers on semi-supervised training, domain adaptation, multimodal training (text and images) and unsupervised training / clustering. More information can be found here.
Deep Depth From Focus
DDFF aims at predicting a depth map from a given focal stack in which the focus of the camera gradually changes. DDFFNet is an end-to-end trained Convolutional Neural Network, designed to solve the highly ill-posed depth from focus task. Please visit the DDFF Project Page for details.
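To make the input and output of the task concrete, the following sketch shows a simple classical baseline, which is explicitly not DDFFNet: for each pixel, it picks the slice of the focal stack that appears sharpest. The squared-Laplacian focus measure, the function name `depth_index_from_focus` and the synthetic stack are assumptions chosen for illustration; the result is a per-pixel slice index, which would still need to be mapped to metric depth using the focus settings of the camera.

```python
import numpy as np


def depth_index_from_focus(focal_stack):
    """Pick, for every pixel, the index of the sharpest slice in a focal stack.

    focal_stack: array of shape (S, H, W), one grayscale image per focus setting.
    Returns an integer array of shape (H, W) with values in [0, S).
    """
    stack = np.asarray(focal_stack, dtype=np.float64)
    # Replicate the border so the Laplacian is defined at every pixel.
    padded = np.pad(stack, ((0, 0), (1, 1), (1, 1)), mode="edge")
    # Discrete 4-neighbour Laplacian of every slice.
    laplacian = (
        padded[:, :-2, 1:-1] + padded[:, 2:, 1:-1]
        + padded[:, 1:-1, :-2] + padded[:, 1:-1, 2:]
        - 4.0 * stack
    )
    # Assumption: the squared Laplacian response serves as the focus measure.
    sharpness = laplacian ** 2
    return sharpness.argmax(axis=0)


# Synthetic stack: only the middle slice contains texture (a checkerboard),
# so it should be selected as the in-focus slice everywhere.
rows, cols = np.indices((8, 8))
checkerboard = ((rows + cols) % 2).astype(np.float64)
focal_stack = np.stack([np.zeros((8, 8)), checkerboard, np.zeros((8, 8))])

depth_index = depth_index_from_focus(focal_stack)
```

A hand-crafted focus measure like this one gives no reliable answer in textureless regions, which is one reason the task is ill-posed.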
FlowNet: Learning Optical Flow with Convolutional Networks
In our ICCV 2015 paper, we presented two CNN architectures that estimate the optical flow from a single image pair. The networks are trained end-to-end on a GPU, and the system performs on par with state-of-the-art techniques.
Unsupervised Domain Adaptation for Vehicle Control
Even though end-to-end supervised learning has shown promising results for sensorimotor control of self-driving cars, its performance depends strongly on the weather conditions under which the model was trained, and it generalizes poorly to unseen conditions. We therefore show how knowledge can be transferred to new weather conditions using semantic maps, without the need to obtain new ground truth data. To this end, we propose to divide the task of vehicle control into two independent modules: a control module, which is trained on only one weather condition for which labeled steering data is available, and a perception module, which serves as an interface between new weather conditions and the fixed control module. To generate the semantic data needed to train the perception module, we propose to use a model based on a generative adversarial network (GAN) that retrieves the semantic information for the new conditions in an unsupervised manner. We introduce a master-servant architecture, where the master model (semantic labels available) trains the servant model (semantic labels not available).
|q-Space Deep Learning: Twelve-Fold Shorter and Model-Free Diffusion MRI Scans, In IEEE Transactions on Medical Imaging, volume 35, 2016. [bib] [pdf] Special Issue on Deep Learning|
|Conference and Workshop Papers|
|Deep Virtual Stereo Odometry: Leveraging Deep Depth Prediction for Monocular Direct Sparse Odometry, In European Conference on Computer Vision (ECCV), 2018. ([arxiv], [supplementary], [video]) [bib] Oral Presentation|
|q-Space Novelty Detection with Variational Autoencoders, In ArXiv preprint, 2018. (arXiv:1806.02997) [bib] [pdf]|
|What Makes Good Synthetic Training Data for Learning Disparity and Optical Flow Estimation?, In International Journal of Computer Vision, 2018. ([arxiv]) [bib]|
|Associative Deep Clustering - Training a Classification Network with no Labels, In Proc. of the German Conference on Pattern Recognition (GCPR), 2018. [bib] [pdf]|
|Clustering with Deep Learning: Taxonomy and New Methods, In ArXiv preprint, 2018. (arXiv:1801.07648) [bib]|
|Precursor microRNA Identification Using Deep Convolutional Neural Networks, In bioRxiv preprint, 2018. (bioRxiv:414656) [bib] [pdf]|
|q-Space Deep Learning for Alzheimer's Disease Diagnosis: Global Prediction and Weakly-Supervised Localization, In International Society for Magnetic Resonance in Medicine (ISMRM) Annual Meeting, 2018. [bib] [pdf]|
|Deep Depth From Focus, In Asian Conference on Computer Vision (ACCV), 2018. ([arxiv], [dataset]) [bib]|
|Regularization for Deep Learning: A Taxonomy, In ArXiv preprint, 2017. (arXiv:1710.10686) [bib] [pdf]|
|Associative Domain Adaptation, In IEEE International Conference on Computer Vision (ICCV), 2017. ([code], [PDF from CVF]) [bib] [pdf]|
|Better Text Understanding Through Image-To-Text Transfer, In ArXiv preprint, 2017. (arXiv:1705.08386) [bib] [pdf]|
|One-Shot Video Object Segmentation, In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017. [bib] [pdf]|
|Learning Proximal Operators: Using Denoising Networks for Regularizing Inverse Imaging Problems, In IEEE International Conference on Computer Vision (ICCV), 2017. ([arxiv], [code]) [bib]|
|Learning by Association - A versatile semi-supervised training method for neural networks, In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017. ([code], [PDF from CVF]) [bib] [pdf]|
|3D Deep Learning for Biological Function Prediction from Physical Fields, In ArXiv preprint, 2017. (arXiv:1704.04039) [bib] [pdf]|
|Establishment of an interdisciplinary workflow of machine learning-based Radiomics in sarcoma patients, In 23. Jahrestagung der Deutschen Gesellschaft für Radioonkologie (DEGRO), 2017. [bib]|
|Image-based localization using LSTMs for structured feature correlation, In IEEE International Conference on Computer Vision (ICCV), 2017. ([arxiv]) [bib]|
|FuseNet: Incorporating Depth into Semantic Segmentation via Fusion-based CNN Architecture, In Asian Conference on Computer Vision (ACCV), 2016. ([code]) [bib] [pdf]|
|Protein Contact Prediction from Amino Acid Co-Evolution Using Convolutional Networks for Graph-Valued Images, In Annual Conference on Neural Information Processing Systems (NIPS), 2016. ([video]) [bib] [pdf] Oral Presentation (acceptance rate: under 2%)|
|A Large Dataset to Train Convolutional Networks for Disparity, Optical Flow, and Scene Flow Estimation, In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016. (arXiv:1512.02134) [bib] [pdf]|
|CAPTCHA Recognition with Active Deep Learning, In GCPR Workshop on New Challenges in Neural Computation, 2015. ([code]) [bib] [pdf]|
|FlowNet: Learning Optical Flow with Convolutional Networks, In IEEE International Conference on Computer Vision (ICCV), 2015. ([video], [code]) [bib] [pdf] [doi]|
|q-Space Deep Learning for Twelve-Fold Shorter and Model-Free Diffusion MRI Scans, In Medical Image Computing and Computer Assisted Intervention (MICCAI), 2015. [bib] [pdf]|