Source Separation and Restoration of Sound Components in Music Recordings
Data Management, Data-Intensive Systems, Computer Science Methods in Business Informatics
Final Report Abstract
The SeReCo project advanced machine learning techniques for separating and restoring sound components in complex music recordings. Initially centered on drum recordings, the project expanded to other challenging scenarios, including piano and string music. A particular focus was the separation of piano concertos into distinct piano and orchestral parts, a task made difficult by the intricate interplay between the instruments. The project explored this scenario in Music Information Retrieval (MIR) for the first time, moving beyond the field's traditional focus on source separation in popular music.
The SeReCo project made substantial contributions across technical, conceptual, and practical aspects, highlighting the potential of deep learning techniques within a musical context. First, it developed data-driven machine learning algorithms for source separation and audio decomposition that incorporate musical knowledge and classical signal processing methods. Building on this foundation, the project tackled the complex task of separating piano concertos into individual piano and orchestral tracks. To address the limited training data available for deep learning models, the project introduced musically motivated data augmentation techniques, which significantly improved the performance of source separation algorithms, particularly in scenarios with high spectro-temporal correlations. For evaluation purposes, the project created a multitrack dataset of piano concertos, featuring synchronized orchestral and piano tracks performed by both professional and amateur pianists. This dataset serves as a valuable resource for evaluating source separation models and has broader applications in MIR. Additionally, the project used score-informed Nonnegative Matrix Factorization (NMF) to develop a notewise signal-to-distortion ratio (SDR) measure, which provides deeper insight into various source separation artifacts.
The impact of the SeReCo project extends beyond its technical innovations. It resulted in a rich collection of open-source resources, including well-documented Python toolboxes, datasets, and supplementary materials such as audio examples and implementations. The project also provided demonstrators to illustrate potential applications, such as tools for automatically creating orchestral accompaniments. Overall, this project not only addressed a novel challenge in MIR and audio signal processing but also significantly expanded the possibilities for interaction between pianists and existing classical music performances.
Publications
- Sync Toolbox: A Python package for efficient, robust, and accurate music synchronization. Journal of Open Source Software (JOSS), 6(64):3434:1–4, 2021
  Meinard Müller, Yigitcan Özer, Michael Krause, Thomas Prätzlich & Jonathan Driedger
- Deep learning and knowledge integration for music audio analysis (Dagstuhl Seminar 22082). Dagstuhl Reports, 12(2):103–133, 2022
  Meinard Müller, Rachel Bittner, Juhan Nam, Michael Krause & Yigitcan Özer
- Investigating Nonnegative Autoencoders for Efficient Audio Decomposition. In Proceedings of the 30th European Signal Processing Conference (EUSIPCO), pages 254–258, IEEE, 2022
  Yigitcan Özer, Jonathan Hansen, Tim Zunner & Meinard Müller
- Source separation of piano concertos with test-time adaptation. In Proceedings of the International Society for Music Information Retrieval Conference (ISMIR), pages 493–500, Bengaluru, India, 2022
  Yigitcan Özer & Meinard Müller
- Using activation functions for improving measure-level audio synchronization. In Proceedings of the International Society for Music Information Retrieval Conference (ISMIR), pages 749–756, Bengaluru, India, 2022
  Yigitcan Özer, Matej Ištvánek, Vlora Arifi-Müller & Meinard Müller
- High-resolution violin transcription using weak labels. In Proceedings of the International Society for Music Information Retrieval Conference (ISMIR), pages 223–230, Milano, Italy, 2023
  Nazif Can Tamer, Yigitcan Özer, Meinard Müller & Xavier Serra
- Piano Concerto Dataset (PCD): A Multitrack Dataset of Piano Concertos. Transactions of the International Society for Music Information Retrieval, 6(1):75–88
  Yigitcan Özer, Simon Schwär, Vlora Arifi-Müller, Jeremy Lawrence, Emre Sen & Meinard Müller
- TAPE: An End-to-End Timbre-Aware Pitch Estimator. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 1–5, IEEE, 2023
  Nazif Can Tamer, Yigitcan Özer, Meinard Müller & Xavier Serra
- Source Separation of Piano Concertos Using Musically Motivated Augmentation Techniques. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 32:1214–1225, 2024
  Yigitcan Özer & Meinard Müller
