
Deep Models for Handheld Light Field Acquisition

Subject Area: Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Term: 2020 to 2024
Project identifier: Deutsche Forschungsgemeinschaft (DFG) - Project number 437172262
Final Report Year: 2024

Final Report Abstract

Light fields capture 3D scenes from multiple viewpoints and have applications in precise free-viewpoint rendering of a scene and in the estimation of object geometries and materials. The aim of this project was to develop generalizable deep models for light field representation and algorithms for light field reconstruction, in contrast to existing approaches that assume fixed acquisition setups. In the course of this research project, important scientific results were achieved on novel generative models for light fields, implicit representations for light fields, text-guided generalizable image reconstruction and manipulation, and the robustness of image reconstruction methods, with the overarching theme of flexible, robust models for light field and image recovery. Illustrative code sketches of several of these components follow this abstract.

We developed the first generative model for light fields, a generative autoencoder conditioned on the central view. We used this model as a prior for light field recovery across diverse tasks, including light field view synthesis, spatial-angular super-resolution, and recovery from coded projections, with advantages in flexibility and robustness over end-to-end trained networks.

We proposed a neural implicit representation for 4D light fields that is conditioned on a sparse set of input views and produces light field values for a continuous range of query spatio-angular coordinates. This scheme can super-resolve a sparse set of input views to any desired spatial and angular resolution and can additionally handle corrupt input views with missing pixels.

Further, we demonstrated the use of text-conditioned image diffusion models for image restoration and manipulation. We devised a fast, zero-shot method for text-guided image manipulation that maintains content consistency without further optimization or fine-tuning. We also proposed text-guided flexible image super-resolution to generate semantically accurate reconstructions that maintain data consistency with the low-resolution inputs. Our approach produced diverse solutions that are semantically aligned with the input text, while maintaining consistency with the degraded images for flexible upsampling factors.

Finally, we investigated the robustness of deep networks for image recovery in the settings of image deblurring and computed tomography. We analyzed robustness to different adversarial attacks, studied the transferability of attacks across methods, and examined the effect of architectural components on adversarial robustness. We further showed that localized attacks can be used in a beneficial manner to explore solutions to ill-posed reconstruction problems.
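As a rough illustration of the generative model described above, the following PyTorch sketch shows an autoencoder for 4D light fields conditioned on the central view. It is a minimal, deterministic stand-in for the project's generative variant; the tensor layout (all views stacked along the channel axis), the layer sizes, and the latent dimension are illustrative assumptions, not the actual architecture.

```python
# Minimal sketch (assumed architecture): a light field autoencoder whose
# decoder is conditioned on features of the central view.
import torch
import torch.nn as nn

class CentralViewConditionedAE(nn.Module):
    def __init__(self, num_views=49, latent_dim=256):
        super().__init__()
        # Encoder: full light field (views stacked on channels) -> latent code.
        self.encoder = nn.Sequential(
            nn.Conv2d(num_views * 3, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, latent_dim),
        )
        # Conditioning branch: features of the central view only.
        self.cond = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
        )
        # Decoder: latent code + central-view features -> all views.
        self.decoder = nn.Sequential(
            nn.Conv2d(128 + latent_dim, 256, 3, padding=1), nn.ReLU(),
            nn.Upsample(scale_factor=2, mode='bilinear'),
            nn.Conv2d(256, 128, 3, padding=1), nn.ReLU(),
            nn.Upsample(scale_factor=2, mode='bilinear'),
            nn.Conv2d(128, num_views * 3, 3, padding=1),
        )

    def forward(self, light_field, central_view):
        z = self.encoder(light_field)                      # (B, latent_dim)
        f = self.cond(central_view)                        # (B, 128, H/4, W/4)
        z_map = z[:, :, None, None].expand(-1, -1, f.shape[2], f.shape[3])
        return self.decoder(torch.cat([f, z_map], dim=1))  # (B, num_views*3, H, W)
```

Used as a prior, such a model restricts reconstructions to its learned output manifold: a recovery task can be solved by optimizing the latent code so that the decoded light field matches the available observations.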
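The implicit representation can be pictured as an MLP that maps a continuous spatio-angular query (x, y, u, v), together with a conditioning feature derived from the sparse input views, to an RGB value. The positional encoding depth, layer widths, and conditioning dimension below are assumptions for illustration.

```python
# Minimal sketch of a conditional implicit light field representation.
import torch
import torch.nn as nn

def positional_encoding(coords, num_freqs=6):
    # Lift raw coordinates to sin/cos features so the MLP can
    # represent high-frequency detail.
    feats = [coords]
    for k in range(num_freqs):
        feats += [torch.sin((2.0 ** k) * coords), torch.cos((2.0 ** k) * coords)]
    return torch.cat(feats, dim=-1)

class ImplicitLightField(nn.Module):
    def __init__(self, cond_dim=128, num_freqs=6, hidden=256):
        super().__init__()
        in_dim = 4 * (1 + 2 * num_freqs) + cond_dim  # encoded (x,y,u,v) + condition
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),  # RGB value at the queried coordinate
        )

    def forward(self, coords, cond):
        # coords: (N, 4) continuous spatio-angular queries in [-1, 1]
        # cond:   (N, cond_dim) features derived from the sparse input views
        return self.mlp(torch.cat([positional_encoding(coords), cond], dim=-1))
```

Because the network answers arbitrary continuous queries, sampling a denser (x, y) grid yields spatial super-resolution and a denser (u, v) grid yields angular super-resolution, and input views with missing pixels can simply be excluded from the fitting loss.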
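The data-consistency requirement in the text-guided super-resolution work can be illustrated with a standard back-projection step: a candidate high-resolution image is corrected so that its downsampled version agrees with the observed low-resolution input. The bicubic degradation operator below is an assumed stand-in, not necessarily the degradation model used in the project.

```python
# Sketch of a back-projection data-consistency step (assumed bicubic
# degradation): nudge a high-resolution candidate toward agreement with
# the observed low-resolution image.
import torch
import torch.nn.functional as F

def enforce_data_consistency(hr_candidate, lr_input, scale):
    # Apply the assumed degradation to the candidate.
    down = F.interpolate(hr_candidate, scale_factor=1.0 / scale,
                         mode='bicubic', align_corners=False)
    # Low-resolution residual: what the candidate currently gets wrong.
    residual = lr_input - down
    # Push the residual back to high resolution and correct the candidate.
    return hr_candidate + F.interpolate(residual, scale_factor=scale,
                                        mode='bicubic', align_corners=False)
```

Interleaving such a step with the sampling iterations of a text-conditioned diffusion model is one common way to keep the diverse, text-aligned outputs approximately consistent with the degraded input for any upsampling factor the degradation operator supports.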
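The robustness study can likewise be illustrated with a projected-gradient-descent (PGD) style attack on a reconstruction network: a small L-infinity-bounded perturbation of the measurement is optimized to maximally change the reconstruction. The network handle `net`, the step size, and the perturbation budget are illustrative assumptions.

```python
# Sketch of a PGD-style attack on an image reconstruction network (assumed
# interface: net maps a measurement tensor to a reconstructed image).
import torch
import torch.nn.functional as F

def pgd_attack(net, measurement, eps=2 / 255, alpha=0.5 / 255, steps=20):
    for p in net.parameters():
        p.requires_grad_(False)          # attack the input, not the weights
    target = net(measurement).detach()   # reconstruction from the clean input
    delta = torch.zeros_like(measurement, requires_grad=True)
    for _ in range(steps):
        loss = F.mse_loss(net(measurement + delta), target)
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()  # ascend the reconstruction error
            delta.clamp_(-eps, eps)             # project back into the budget
        delta.grad.zero_()
    return (measurement + delta).detach()
```

Restricting the perturbation to a small image region turns the same machinery into the localized attacks mentioned above, which probe how strongly individual measurement regions can sway the solution of an ill-posed reconstruction problem.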

