
Video Segmentation from Multiple Representations with Lifted Multicuts

Subject Area: Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Funding: from 2017 to 2022
Project identifier: Deutsche Forschungsgemeinschaft (DFG) - Project number 360826079
Year created: 2022

Summary of project results

The goal of this project has been to address the task of temporally consistent video segmentation based on graphical models, specifically minimum cost multicuts (MC) and minimum cost lifted multicuts (LMC). The MC formulation allows one to define attractive and repulsive terms and thus facilitates even the segmentation of small objects. It has proven very powerful in the context of motion segmentation, even when using only low-level cues such as pairwise motion cues and color. Lifted multicuts build on the definition of the standard MC and additionally allow long-range dependencies to be defined without changing the set of feasible solutions (a standard formulation is sketched at the end of this summary). In this project, we have extended and applied the MC and LMC frameworks in the context of video and motion segmentation to leverage their flexibility. We envisioned investigating the inclusion of a variety of segmentation cues, such as point trajectories, motion boundaries, and occlusion regions, and producing video and motion segmentations using a joint optimization approach.

Our first contributions towards these goals were made while continuing previously existing research directions and collaborations. Specifically, we continued to work on a joint model for motion segmentation and multiple object tracking, which had been published as a technical report by the time the proposal was written. Therein, we built a joint MC based on point trajectories, which are connected via color and motion cues, as well as object detections represented by bounding boxes and, potentially, their respective object segmentation masks. It was also the winning approach of the Multiple Object Tracking Challenge held at the CVPR 2017 workshop on Multiple Object Tracking. Similarly, we proposed a first extension of our efficient LMC solver to facilitate the definition of higher-order edges and higher-order lifted edges. This allows the representation of motion models beyond translational motion and therefore improves motion segmentations.

In collaboration with the group of Thomas Brox at Freiburg University, we proposed to jointly learn estimates of optical flow, motion boundaries, and occlusion regions. The resulting cues can be used directly to improve the estimation of point trajectories in videos, such that motion segmentations based on these trajectories are improved. Further, we proposed a new solver for the LMC which adapts a pre-existing fast greedy heuristic for the MC using a balancing criterion, so that it yields practical results for the LMC (an illustrative sketch of such a greedy joining heuristic follows this summary). As suggested by our reviewers, we evaluated our method not only on natural images but also on data from the medical domain, specifically electron microscopy stacks of neuronal structures. We also conducted an evaluation of the different available low-level cues and their impact on video object segmentation and tracking by proposing a simple optical-flow-based tracking approach. We further investigated the potential of learning graph weights, of learning dense segmentations from sparse ones, and of predicting the uncertainties of the resulting motion segmentations.

Last, we addressed the problem of closed boundary prediction in single images and proposed a method that employs MC constraints within a neural network. The improved model predicts cleaner edge maps with fewer trailing, open boundaries and allows for improved segmentation. Finally, we consolidated our work on higher-order LMC for motion segmentation and extended it to multiple model fitting.
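For readers less familiar with these objectives, the following is a standard, textbook-style statement of the minimum cost (lifted) multicut problem in common notation; it is not copied from the project's publications, and the symbols G = (V, E), the lifted edge set E', and the costs c are generic and used only for illustration. For a graph G = (V, E) with edge costs c : E -> R, the MC asks for a 0/1 labeling y of the edges (y_e = 1 meaning edge e is cut) that is consistent with a decomposition of the graph:

\[
\min_{y \in \{0,1\}^{E}} \sum_{e \in E} c_e\, y_e
\qquad \text{s.t.} \qquad
y_e \le \sum_{e' \in C \setminus \{e\}} y_{e'}
\quad \text{for every cycle } C \subseteq E \text{ and every } e \in C .
\]

Positive costs c_e act as attractive terms (cutting the edge is penalized) and negative costs as repulsive terms (cutting is rewarded). The lifted multicut additionally introduces an edge set E' with E' containing E, for example long-range connections between distant point trajectories, and minimizes \(\sum_{e \in E'} c_e\, y_e\) under the constraint that a lifted edge vw in E' \ E is labeled joined (y_{vw} = 0) exactly if v and w are connected by a path of joined edges in the original graph G. The lifted costs therefore shape the objective without changing which decompositions of G are feasible.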
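The greedy joining idea mentioned above can be illustrated with a minimal agglomerative sketch in Python. This is only an illustration in the spirit of greedy additive edge contraction, not the project's actual LMC solver (which additionally handles lifted edges and the balancing criterion); the function name greedy_multicut and its interface are invented for this example.

def greedy_multicut(num_nodes, edges):
    """edges: iterable of (u, v, cost) with u != v; a positive cost is an
    attractive term (joining u and v is rewarded), a negative cost a
    repulsive one. Returns a dict mapping every node to a cluster id."""
    cluster_of = {v: v for v in range(num_nodes)}
    # accumulated cost between current clusters
    cost = {}
    for u, v, c in edges:
        key = frozenset((u, v))
        cost[key] = cost.get(key, 0.0) + c
    while True:
        # pick the cluster pair whose joining decreases the objective the most
        best_key, best_cost = None, 0.0
        for key, c in cost.items():
            if c > best_cost:
                best_key, best_cost = key, c
        if best_key is None:  # no attractive pair left: stop
            break
        a, b = tuple(best_key)
        # merge cluster b into cluster a and re-accumulate inter-cluster costs
        merged = {}
        for key, c in cost.items():
            u, v = tuple(key)
            u, v = (a if u == b else u), (a if v == b else v)
            if u != v:
                k = frozenset((u, v))
                merged[k] = merged.get(k, 0.0) + c
        cost = merged
        for node, cl in cluster_of.items():
            if cl == b:
                cluster_of[node] = a
    return cluster_of

# Toy example: attractive costs group nodes 0-2 and nodes 3-4, repulsive
# costs keep the two groups apart (the cluster ids depend on merge order).
edges = [(0, 1, 2.0), (1, 2, 1.5), (0, 2, 0.5),
         (3, 4, 1.0), (2, 3, -2.0), (0, 4, -1.0)]
print(greedy_multicut(5, edges))

The project's solvers operate on far larger graphs built from point trajectories and rely on priority queues and the balancing criterion mentioned above; the quadratic rescan in this sketch is chosen only for readability.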

