Project Details

Advancing Computational Musicology: Semi-supervised and unsupervised segmentation and annotation of musical collections (ACMus)

Applicant Professor Dr.-Ing. Karlheinz Brandenburg, since 12/2018
Subject Area Electronic Semiconductors, Components and Circuits, Integrated Systems, Sensor Technology, Theoretical Electrical Engineering
Musicology
Term from 2018 to 2023
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 403542342
 
Final Report Year 2022

Final Report Abstract

The research project Advancing Computational Musicology: Semi-supervised and unsupervised segmentation and annotation of musical collections (ACMus) was an international and interdisciplinary collaboration between Colombian and German research institutes, with a team of data scientists, professional musicians, and signal processing experts. The goal of ACMus was to create music analysis tools using semi-supervised and unsupervised deep learning techniques for content for which only sparse data is available. The main fields of interest were the classification of musical content by number of instruments, rhythm, vocals, scale, and meter. A Colombian archive of traditional music from the Andes region was the focus of the research because of its challenging combination of unique rhythms and instrumentation and its varying recording conditions. As a first step, an annotated dataset was created and published as the ACMUS-MIR dataset; it was extended continuously during the project. To ensure high-quality annotations, the process was led by professional musicians from Colombia. This annotation phase illustrated how expensive and time-consuming manual annotation is, even for small amounts of data, and emphasized the need for automated alternatives. Promising initial results were achieved by applying transfer learning to the classification tasks. This method first trains a neural network in a data-rich domain, where enough data is available for robust results. Subsequently, the knowledge (i.e., the trained network with its parameters) is transferred to the actual target task and fine-tuned on the sparse data. Furthermore, semi-supervised approaches originally developed for computer vision, which incorporate unlabeled data into the training process, were adapted to audio classification tasks.
All of the applied methods significantly improved the classification performance. With the ACMUS-MIR dataset, a new annotated music collection is available to the scientific community, drawing attention to the unique regional and traditional music of the Andes. The techniques and classification models developed within the project show great potential for future research and related applications. Challenges for future work include simplifying the tools so that they can be used by non-technical experts and enabling their adaptation to novel domains. Future work can also shift the focus to related areas with sparse data in order to evaluate how well these methods generalize and to enlarge the potential user community.
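The semi-supervised idea mentioned in the abstract, and named in the FixMatch publication listed below, can be illustrated with a minimal sketch. It is not the project's implementation: the function name `fixmatch_unlabeled_loss`, the plain-list inputs, and the threshold of 0.95 (the default reported in the original FixMatch paper) are assumptions made for illustration. The sketch shows only the unlabeled part of the loss: a prediction on a weakly augmented clip becomes a pseudo-label if it is confident enough, and the prediction on a strongly augmented version of the same clip is then penalized with cross-entropy against that pseudo-label.

```python
import math


def fixmatch_unlabeled_loss(probs_weak, probs_strong, threshold=0.95):
    """Illustrative FixMatch-style loss on unlabeled samples.

    probs_weak / probs_strong: per-sample class probabilities predicted for
    the weakly and strongly augmented version of the same unlabeled sample.
    Returns (mean loss over all unlabeled samples, number of samples kept).
    """
    if not probs_weak:
        return 0.0, 0

    total = 0.0
    kept = 0
    for q_weak, q_strong in zip(probs_weak, probs_strong):
        confidence = max(q_weak)
        if confidence < threshold:
            # Prediction too uncertain: sample contributes no loss.
            continue
        pseudo_label = q_weak.index(confidence)
        # Cross-entropy between the hard pseudo-label and the
        # prediction on the strongly augmented sample.
        total += -math.log(q_strong[pseudo_label])
        kept += 1

    # As in FixMatch, average over the whole unlabeled batch,
    # not only over the samples that passed the threshold.
    return total / len(probs_weak), kept


# Hypothetical two-class example with three unlabeled clips.
weak = [[0.97, 0.03], [0.60, 0.40], [0.02, 0.98]]
strong = [[0.50, 0.50], [0.90, 0.10], [0.50, 0.50]]
loss, kept = fixmatch_unlabeled_loss(weak, strong)
print(f"kept {kept} of {len(weak)} samples, unlabeled loss = {loss:.4f}")
```

In a full training loop this term would be added, with a weighting factor, to the ordinary supervised loss computed on the small labeled set; the augmentations themselves (for audio, for example, operations on the spectrogram) are outside the scope of this sketch.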

Publications

  • “ACMUS - Advancing Computational Musicology: Semi-supervised and Unsupervised Segmentation and Annotation of Musical Collections,” in ISMIR Latebreaking-demo, Delft, The Netherlands, 2019
    E. Cano, A. Escamilla, S. Grollmisch, C. Kehling, F. Mora-Ángel, G. López Gil, and J. R. Zapata
  • “ACMUS-MIR: An Annotated Dataset of Andean Colombian Music,” in 7th International Conference on Digital Libraries for Musicology, Delft, The Netherlands, 2019
    F. Mora-Ángel, G. A. L. Gil, E. Cano, and S. Grollmisch
  • “Ensemble size classification in Colombian Andean string music recordings,” in CMMR, Marseille, France, 2019
    S. Grollmisch, E. Cano, F. Mora-Ángel, and G. López Gil
    (See online at https://doi.org/10.1007/978-3-030-70210-6_4)
  • “Analyzing the Potential of Pre-Trained Embeddings for Audio Classification Tasks,” in EUSIPCO, Amsterdam, The Netherlands, 2020
    S. Grollmisch, E. Cano, C. Kehling, and M. Taenzer
    (See online at https://doi.org/10.23919/Eusipco47968.2020.9287743)
  • “Sesquialtera in the Colombian Bambuco: Perception and Estimation of Beat and Meter,” in ISMIR, Montreal, Canada, 2020
    E. Cano, F. Mora-Ángel, G. López Gil, J. R. Zapata, A. Escamilla, J. F. Alzate, and M. Betancur
    (See online at https://doi.org/10.5334/tismir.118)
  • “Techniques Improving the Robustness of Deep Learning Models for Industrial Sound Analysis,” in EUSIPCO, Amsterdam, The Netherlands, 2020
    D. Johnson and S. Grollmisch
    (See online at https://doi.org/10.23919/Eusipco47968.2020.9287327)
  • “Ensemble Size Classification in Colombian Andean String Music Recordings,” in Perception, Representations, Image, Sound, Music, R. Kronland-Martinet, S. Ystad, and M. Aramaki, Eds. Cham, Switzerland: Springer International Publishing, 2021, pp. 60–74
    S. Grollmisch, E. Cano, F. Mora-Ángel, and G. López Gil
    (See online at https://doi.org/10.1007/978-3-030-70210-6_4)
  • “Improving Semi-Supervised Learning for Audio Classification with FixMatch,” Electronics, vol. 10, no. 15, 2021
    S. Grollmisch and E. Cano
    (See online at https://doi.org/10.3390/electronics10151807)
  • “Knowledge Transfer from Neural Networks for Speech Music Classification,” in 15th International Symposium on Computer Music Multidisciplinary Research, Tokyo, Japan, 2021
    C. Kehling and E. Cano
  • “Sesquialtera in the Colombian Bambuco: Perception and Estimation of Beat and Meter – Extended version,” Transactions of the International Society for Music Information Retrieval, vol. 4, no. 1, 2021
    E. Cano, F. Mora-Ángel, G. López Gil, J. R. Zapata, A. Escamilla, J. F. Alzate, and M. Betancur
    (See online at https://doi.org/10.5334/tismir.118)
 
 
