Video Segmentation from Multiple Representations using Lifted Multicuts
Final Report Abstract
The goal of this project was to address temporally consistent video segmentation using graphical models, specifically minimum cost multicuts (MC) and minimum cost lifted multicuts (LMC). The MC formulation allows both attractive and repulsive terms to be defined and thus facilitates the segmentation of even small objects. Multicuts have proven very powerful for motion segmentation, even when using only low-level cues such as pairwise motion cues and color. Lifted multicuts build on the standard MC and additionally allow long-range dependencies to be defined without changing the set of feasible solutions. In this project, we extended the MC and LMC framework and applied it to video and motion segmentation to leverage its flexibility. We set out to investigate the inclusion of a variety of segmentation cues, such as point trajectories, motion boundaries and occlusion regions, and to produce video and motion segmentations using a joint optimization approach. Our first contributions toward these goals continued previously existing research directions and collaborations. Specifically, we continued to work on a joint model for motion segmentation and multiple object tracking, which had been published as a technical report at the time of proposal writing. Therein, we built a joint MC based on point trajectories, which are connected via color and motion cues, as well as object detections represented by bounding boxes and, potentially, their respective object segmentation masks. It was also the winning approach of the Multiple Object Tracking Challenge held at the CVPR 2017 workshop on Multiple Object Tracking. Similarly, we proposed a first extension of our efficient LMC solver to facilitate the definition of higher-order edges and higher-order (lifted) edges. This allows the representation of motion models beyond translational motion and therefore improves motion segmentations.
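For reference, the minimum cost multicut problem mentioned above is commonly stated as an integer linear program over a graph G = (V, E). The notation below (edge costs c_e, cut indicators y_e, cycles C) is chosen here for illustration and is not taken from the report; under the assumed sign convention, c_e > 0 acts as an attractive term and c_e < 0 as a repulsive term.

```latex
\begin{align}
\min_{y \in \{0,1\}^{E}} \quad & \sum_{e \in E} c_e \, y_e \\
\text{subject to} \quad & y_e \le \sum_{e' \in C \setminus \{e\}} y_{e'}
  \qquad \forall\, C \in \mathrm{cycles}(G),\ \forall\, e \in C
\end{align}
```

Here y_e = 1 means that edge e is cut, and the cycle inequalities ensure that no cycle contains exactly one cut edge, so every feasible y corresponds to a decomposition of G. In the lifted variant, the sum additionally runs over a set of lifted edges whose variables are constrained so that a lifted edge is cut exactly when its endpoints end up in different components of G, which is why the set of feasible decompositions stays the same.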
In collaboration with the group of Thomas Brox at the University of Freiburg, we proposed to jointly learn estimates of optical flow, motion boundaries and occlusion regions. The resulting cues can be used directly to improve the estimation of point trajectories in videos, which in turn improves motion segmentations based on these trajectories. Further, we proposed a new solver for the LMC that adapts a pre-existing fast greedy heuristic for the MC using a balancing criterion so that it yields practical results for the LMC. As suggested by our reviewers, we evaluated our method not only on natural images but also on data from the medical domain, specifically electron microscopy stacks of neuronal structures. We also evaluated the different available low-level cues and their impact on video object segmentation tracking by proposing a simple optical-flow-based tracking approach. In addition, we investigated the potential of learning graph weights, of learning dense segmentations from sparse ones, and of predicting the uncertainties of the resulting motion segmentations. Last, we addressed the problem of closed boundary prediction in single images and proposed a method that employs MC constraints in a neural network. The improved model predicts cleaner edge maps with fewer trailing, open boundaries and allows for improved segmentation. We consolidated our work on higher-order LMC for motion segmentation and extended it to multiple model fitting.
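To make the idea of a fast greedy heuristic for the MC concrete, the following is a minimal sketch of a greedy contraction scheme in the spirit of greedy additive edge contraction. It is an illustration under stated assumptions, not the project's solver: the report does not name the heuristic it adapts, the balancing criterion for the LMC is not reproduced here, and the function names, the toy graph and the sign convention (positive cost = attractive, negative cost = repulsive) are hypothetical choices made for this example.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int]


def greedy_additive_edge_contraction(num_nodes: int,
                                     edges: Dict[Edge, float]) -> List[int]:
    """Greedy multicut heuristic (illustrative sketch).

    Assumed convention: cost > 0 is attractive, cost < 0 is repulsive.
    Repeatedly contracts the most attractive edge between clusters and
    sums the costs of parallel edges that arise from the contraction.
    """
    # adj[a][b] holds the summed cost between clusters a and b.
    adj: Dict[int, Dict[int, float]] = {i: {} for i in range(num_nodes)}
    for (u, v), c in edges.items():
        if u == v:
            continue
        adj[u][v] = adj[u].get(v, 0.0) + c
        adj[v][u] = adj[v].get(u, 0.0) + c

    parent = list(range(num_nodes))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    while True:
        # Linear scan for the most attractive remaining edge (kept simple;
        # a priority queue would be used in an efficient implementation).
        best, best_cost = None, 0.0
        for a, nbrs in adj.items():
            for b, c in nbrs.items():
                if a < b and c > best_cost:
                    best, best_cost = (a, b), c
        if best is None:
            break  # no attractive edge left between clusters
        a, b = best
        # Contract b into a, adding up the costs of parallel edges.
        for n, c in adj.pop(b).items():
            del adj[n][b]
            if n == a:
                continue
            adj[a][n] = adj[a].get(n, 0.0) + c
            adj[n][a] = adj[a][n]
        parent[b] = a

    # Relabel cluster representatives to consecutive integers.
    relabel: Dict[int, int] = {}
    return [relabel.setdefault(find(i), len(relabel)) for i in range(num_nodes)]


def multicut_cost(labels: List[int], edges: Dict[Edge, float]) -> float:
    """Sum of the costs of all edges whose endpoints have different labels."""
    return sum(c for (u, v), c in edges.items() if labels[u] != labels[v])


# Hypothetical toy graph: two attractive pairs linked by repulsive edges.
toy_edges = {(0, 1): 2.0, (1, 2): -3.0, (2, 3): 1.5, (0, 2): 0.5, (1, 3): -1.0}
toy_labels = greedy_additive_edge_contraction(4, toy_edges)
print(toy_labels, multicut_cost(toy_labels, toy_edges))
```

On the toy graph, nodes 0 and 1 are merged first, then nodes 2 and 3, and the remaining edge between the two clusters is repulsive, so the procedure stops with two segments. The linear scan for the best edge keeps the sketch short at the expense of speed.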
Publications
- “Higher-Order Minimum Cost Lifted Multicuts for Motion Segmentation”. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV). Oct. 2017
Margret Keuper
(See online at https://doi.org/10.1109/iccv.2017.455)
- “Motion Segmentation & Multiple Object Tracking by Correlation Co-Clustering”. In: IEEE Transactions on Pattern Analysis and Machine Intelligence 42.1 (2018), pp. 140–153
Margret Keuper et al.
(See online at https://doi.org/10.1109/tpami.2018.2876253)
- “Occlusions, Motion and Depth Boundaries with a Generic Network for Disparity, Optical Flow or Scene Flow Estimation”. In: Proceedings of the European Conference on Computer Vision (ECCV). Sept. 2018
Eddy Ilg et al.
(See online at https://doi.org/10.1007/978-3-030-01258-8_38)
- “Solving Minimum Cost Lifted Multicut Problems by Node Agglomeration”. In: Computer Vision – ACCV 2018. Cham: Springer International Publishing, 2019
Amirhossein Kardoost and Margret Keuper
(See online at https://doi.org/10.1007/978-3-030-20870-7_5)
- “A Two-Stage Minimum Cost Multicut Approach to Self-Supervised Multiple Person Tracking”. In: Proceedings of the Asian Conference on Computer Vision (ACCV). Nov. 2020
Kalun Ho et al.
(See online at https://doi.org/10.1007/978-3-030-69532-3_33)
- “Self-supervised Sparse to Dense Motion Segmentation”. In: Proceedings of the Asian Conference on Computer Vision (ACCV). Nov. 2020
Amirhossein Kardoost et al.
(See online at https://doi.org/10.1007/978-3-030-69532-3_26)
- “Object Segmentation Tracking from Generic Video Cues”. In: 2020 25th International Conference on Pattern Recognition (ICPR). Los Alamitos, CA, USA: IEEE Computer Society, 2021
Amirhossein Kardoost et al.
(See online at https://doi.org/10.1109/icpr48806.2021.9413089)
- “Optimizing Edge Detection for Image Segmentation with Multicut Penalties”. 2021
Steffen Jung, Sebastian Ziegler, Amirhossein Kardoost, and Margret Keuper
(See online at https://doi.org/10.48550/arXiv.2112.05416)
- “Uncertainty in Minimum Cost Multicuts for Image and Motion Segmentation”. In: Proceedings of the 37th Conference on Uncertainty in Artificial Intelligence (UAI). 2021
Amirhossein Kardoost and Margret Keuper