Weakly Supervised Learning for Depth Estimation in Monocular Images
Final Report Abstract
In this project, we developed novel approaches for depth estimation in monocular images. We studied two types of machine learning methods that require only “weak” supervision, namely learning to rank and superset learning. Furthermore, we investigated computational models for the construction of monocular depth features, motivated by human visual perception. For learning to rank, we proposed treating depth estimation as a listwise ranking problem, leveraging the well-known Plackett-Luce probability distribution on rankings. We proposed a neural network architecture to learn the parameters of the distribution over depth rankings and realized an efficient and cost-effective way of training listwise depth ranking models. Additionally, we showed that the model allows for the estimation of shift-invariant metric depth information from ranking-only data provided at training time. To construct (interpretable) features for monocular depth estimation, we modeled and implemented four monocular criteria (linear perspective, occlusion, relative height, usual size) that are relevant for both indoor and outdoor images. We analyzed to what extent these features are implicitly learned by a data-driven deep neural network. In follow-up work, we investigated whether ranking errors of a state-of-the-art model can be detected and corrected using the hand-crafted features. We proposed using cross-attention in a transformer decoder to learn spatial relations between two image patches by exploiting patch-wise image context. Our experiments showed that the model can predict and correct a subset of the errors made by a state-of-the-art approach.
Other project publications leveraged the concept of superset learning. Under the notion of label relaxation, we proposed weakening label information in a general framework, exemplified for probabilistic classification. Here, superset labels are composed of multiple candidate probability distributions, forming credal sets, whose mathematical properties are exploited to obtain a learning methodology that is efficient and robust against distortions from inaccurate data. In another project publication, we transferred this idea to the field of monocular depth estimation. Instead of taking sensor signals as exact measurements, we followed the idea of label relaxation by placing (fuzzy) supersets around the originally observed depth value. Together with generalized empirical risk minimization, this model leads to more robust and better generalizing depth regression models. Furthermore, we extended the scope of label re-modeling to the paradigm of semi-supervised learning. The richer form of supervision by (fuzzy) supersets was leveraged in a credal self-supervised learning approach. Instead of using single (precise) probability distributions as pseudo-labels, credal sets are constructed by a self-learner. Combined with generalized empirical risk minimization, this method leads to a more cautious yet robust learning behavior. Moreover, we applied a similar idea to the problem of semi-supervised monocular depth estimation.
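To make the listwise ranking objective mentioned above more concrete, the following is a minimal sketch of the negative log-likelihood of a ranking under the Plackett-Luce distribution. It is an illustration under stated assumptions, not the project's implementation: the function name `plackett_luce_nll`, the use of plain Python lists, and the convention that the first entry of `ranking` is the top-ranked item are all hypothetical choices. In a depth setting, `scores` would stand for per-pixel network outputs and `ranking` for the order of the sampled pixels by depth.

```python
import math


def plackett_luce_nll(scores, ranking):
    """Negative log-likelihood of `ranking` under a Plackett-Luce model.

    scores  -- one real-valued score per item (log of the PL skill parameter)
    ranking -- item indices ordered from first to last place (assumed convention)
    """
    ordered = [scores[i] for i in ranking]
    nll = 0.0
    for k in range(len(ordered)):
        # Items that have not been placed yet compete for position k.
        remaining = ordered[k:]
        # Numerically stable log-sum-exp over the remaining scores.
        m = max(remaining)
        log_norm = m + math.log(sum(math.exp(s - m) for s in remaining))
        # -log P(item at position k is chosen among the remaining items)
        nll += log_norm - ordered[k]
    return nll


# Hypothetical example: three pixels with predicted scores.
scores = [2.0, 1.0, 0.0]
loss_consistent = plackett_luce_nll(scores, [0, 1, 2])
loss_reversed = plackett_luce_nll(scores, [2, 1, 0])

# Adding a constant to every score leaves the likelihood unchanged.
loss_shifted = plackett_luce_nll([s + 5.0 for s in scores], [0, 1, 2])
```

Because each factor is a softmax over the remaining items, the loss depends only on score differences. This is one way to see why a model trained on rankings alone can recover depth information only up to a shift, in line with the shift-invariant estimates described above.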
Publications
-
Credal self-supervised learning. In Proceedings NeurIPS, Advances in Neural Information Processing Systems (Vol. 34, pp. 14370-14382)
Lienen, J. & Hüllermeier, E.
-
From Label Smoothing to Label Relaxation. Proceedings of the AAAI Conference on Artificial Intelligence, 35(10), 8583-8591.
Lienen, Julian & Hüllermeier, Eyke
-
Instance weighting through data imprecisiation. International Journal of Approximate Reasoning, 134, 1-14.
Lienen, Julian & Hüllermeier, Eyke
-
Monocular Depth Estimation via Listwise Ranking using the Plackett-Luce Model. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 14590-14599. IEEE.
Lienen, Julian; Hüllermeier, Eyke; Ewerth, Ralph & Nommensen, Nils
-
Robust Regression for Monocular Depth Estimation. In Asian Conference on Machine Learning (pp. 1001-1016)
Lienen, J., Nommensen, N., Ewerth, R. & Hüllermeier, E.
-
Scikit-Weak: A Python Library for Weakly Supervised Machine Learning. Lecture Notes in Computer Science, 57-70. Springer Nature Switzerland.
Campagner, Andrea; Lienen, Julian; Hüllermeier, Eyke & Ciucci, Davide
-
Analyzing Results of Depth Estimation Models with Monocular Criteria. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 3739-3743. IEEE.
Theiner, Jonas; Nommensen, Nils; Rhotert, Jim; Springstein, Matthias; Müller-Budack, Eric & Ewerth, Ralph
-
Conformal Credal Self-Supervised Learning. Proc. COPA, Conformal and Probabilistic Prediction with Applications. Limassol, Cyprus. PMLR 204:1-20, 2023
Lienen, J.; Demir, C. & Hüllermeier, E.
-
Memorization-Dilation: Modeling Neural Collapse Under Noise. Proc. ICLR, 11th International Conference on Learning Representations. Kigali, Rwanda.
Nguyen, D.A., Levie, R., Lienen, J., Hüllermeier, E. & Kutyniok, G.
-
Mitigating Label Noise through Data Ambiguation. Proceedings of the AAAI Conference on Artificial Intelligence, 38(12), 13799-13807.
Lienen, Julian & Hüllermeier, Eyke
