Project Details
Computational Analysis of Music Audio Recordings: A Cross-Version Approach
Applicant
Professor Dr.-Ing. Christof Weiß
Subject Area
Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Term
since 2023
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 531250483
The computational analysis of music audio recordings constitutes a highly interdisciplinary research area, involving domain knowledge from musicology and music theory as well as methods from signal processing and machine learning. From a computer science perspective, the variety and complexity of music audio data pose enormous challenges, which are specific to this domain. First, music is multi-faceted, being characterized by different semantic dimensions such as time, pitch, timbre, or style, which need to be disentangled to obtain interpretable representations. Second, music analysis comprises hierarchically related tasks such as the estimation of pitches, chords, local keys, and global keys or the detection of onsets, beats, downbeats, and structural boundaries, suggesting the use of multi-task approaches. Third, music data is complex, consisting of highly correlated sources whose components overlap in time and frequency. Furthermore, musical notions are often ambiguous and subjective, thus calling for interpretable methods and multiple annotators. Fourth, music scenarios are often data-scarce, lacking large amounts of annotated data. As a consequence, analysis methods frequently overfit to implicit biases in the training datasets, become sensitive to small perturbations, and do not generalize well to unseen data. Data scarcity poses particular challenges for deep-learning approaches, which now dominate the field. These approaches have achieved substantial improvements for many music analysis tasks but often hit a kind of "glass ceiling" above which further progress is hard to achieve and to measure. To overcome this problem, this project adopts a cross-version approach by exploiting datasets of classical music, which contain several modalities (score and audio), several performances (interpretations and arrangements), and several annotations (multiple experts) for each musical work.
Such datasets allow for transferring annotations between versions and for systematically evaluating the robustness of deep-learning methods by testing generalization along different dimensions, e.g., to other versions of a work, other works by a composer, or other composers from a historical period. As a main conceptual contribution, we apply and further develop such cross-version strategies, exploiting them both to better understand the analysis methods and to improve them using suitable training and fusion strategies. Based on this cross-version approach, we address the specific challenges of music data, aiming for analysis methods that are of particular use for computational musicology, and progressing towards novel methodological strategies in the wider field of the digital humanities.
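The idea of testing generalization along different dimensions can be illustrated with a minimal sketch. It is not the project's implementation: the `Recording` tuple, the `split_by` helper, and the toy composer, work, and performance labels are all hypothetical and serve only to show how one dataset can yield a version split, a work split, and a composer split.

```python
from collections import namedtuple

# Hypothetical metadata record: one audio recording of one musical work.
Recording = namedtuple("Recording", ["composer", "work", "version"])


def split_by(recordings, level, held_out):
    """Hold out every recording whose `level` attribute equals `held_out`.

    level is one of "version", "work", or "composer" (illustrative names).
    Returns a (train, test) pair of lists.
    """
    train = [r for r in recordings if getattr(r, level) != held_out]
    test = [r for r in recordings if getattr(r, level) == held_out]
    return train, test


# Toy dataset (assumed labels): two composers, three works, two performances each.
recordings = [
    Recording("ComposerA", "A_work1", "perf1"),
    Recording("ComposerA", "A_work1", "perf2"),
    Recording("ComposerA", "A_work2", "perf1"),
    Recording("ComposerA", "A_work2", "perf2"),
    Recording("ComposerB", "B_work1", "perf1"),
    Recording("ComposerB", "B_work1", "perf2"),
]

# Version split: the test works are known, but the test performances are unseen.
version_train, version_test = split_by(recordings, "version", "perf2")

# Work split: no version of the held-out work appears in training.
work_train, work_test = split_by(recordings, "work", "A_work2")

# Composer split: no work by the held-out composer appears in training.
composer_train, composer_test = split_by(recordings, "composer", "ComposerB")
```

Comparing a model's scores across the three test sets would indicate how much of its performance depends on having seen the same work or composer during training; the version split is the easiest setting, and the composer split the hardest.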
DFG Programme
Independent Junior Research Groups