Project Details
Learning-Based Wavelet Video Coding Using Deep Adaptive Lifting
Applicant
Professor Dr.-Ing. André Kaup
Subject Area
Communication Technology and Networks, High-Frequency Technology and Photonic Systems, Signal Processing and Machine Learning for Information Technology
Term
since 2021
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 461649014
The objective of this project is to investigate learning-based wavelet video coders as an explainable alternative to neural video coders based on nonlinear transform coding. Rather than using neural networks as a black box, learning-based wavelet transforms provide trainable processing steps with a known structure. In a wavelet video coder, trainable wavelet transforms are applied along the temporal, horizontal, and vertical dimensions. Because wavelet transforms are inherently scalable, the investigated video coder provides both spatial and temporal scalability. The wavelet transforms are implemented using the lifting scheme, which has the advantage that any non-linear operation can be applied without harming the reconstruction property of the transform. A learned wavelet coder therefore enables lossless reconstruction, in contrast to general learned video coders. Given these advantages, three new wavelet video coding schemes will be investigated in this project. Based on the wavelet video coding framework, open-loop and closed-loop coding structures will be analyzed in a learning-based environment. Motivated by the recent success of conditional coding, a conditional residual wavelet coder will be investigated. For multi-view video coding, a learned wavelet coder will be developed that applies a wavelet transform along an additional fourth dimension, the view dimension. In addition to the new coding schemes, theoretical support will be provided by analyzing the properties of the learned wavelet transforms employed in the different coding scenarios. While also considering the practical aspects of designing a more powerful wavelet coder, the project will answer the following central question: What performance can be achieved with a learned video coding framework that has a defined structure, and what are its advantages compared to other learned codecs?
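The reconstruction property of the lifting scheme mentioned above can be illustrated with a minimal one-dimensional sketch. It is not the project's coder: the functions `predict` and `update` are hypothetical stand-ins for trained networks, chosen only to be visibly non-linear, and the signal is assumed to have even length. The point is that the inverse transform reuses the same operators with the signs flipped, so the input is recovered exactly regardless of what the operators compute.

```python
import math


def predict(even):
    # Hypothetical non-linear predictor (stand-in for a trained network).
    # Output is rounded to integers so the transform stays integer-to-integer.
    return [math.floor(64.0 * math.tanh(e / 64.0) + 0.5) for e in even]


def update(detail):
    # Hypothetical non-linear update operator (also just an assumption).
    return [math.floor(math.copysign(math.sqrt(abs(d)), d) / 2.0 + 0.5)
            for d in detail]


def forward(signal):
    # Split into even and odd samples (signal length assumed even).
    even, odd = signal[0::2], signal[1::2]
    # Predict step: highpass = odd samples minus a prediction from the even ones.
    high = [o - p for o, p in zip(odd, predict(even))]
    # Update step: lowpass = even samples plus an update from the highpass band.
    low = [e + u for e, u in zip(even, update(high))]
    return low, high


def inverse(low, high):
    # Undo the update step first, using the identical operator.
    even = [l - u for l, u in zip(low, update(high))]
    # Then undo the predict step, which needs the recovered even samples.
    odd = [h + p for h, p in zip(high, predict(even))]
    # Interleave even and odd samples back into one signal.
    signal = [0] * (len(even) + len(odd))
    signal[0::2] = even
    signal[1::2] = odd
    return signal


if __name__ == "__main__":
    samples = [12, 15, 200, 198, 7, 3, 90, 91]
    low, high = forward(samples)
    assert inverse(low, high) == samples  # lossless round trip
```

Each inverse step subtracts exactly the value that the forward step added, computed from data that is already available at the decoder. Invertibility therefore follows from the structure alone, and in a learned coder along these lines only the operators themselves would need to be trained.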
DFG Programme
Research Grants
