Project Details

Connected Open-Source Software (ConnOSS)

Subject Area: Methods in Artificial Intelligence and Machine Learning; Software Engineering and Programming Languages
Term: since 2025
Project identifier: Deutsche Forschungsgemeinschaft (DFG) - Project number 561044496
 
Machine-actionable metadata for research artifacts is key to realizing the FAIR Guiding Principles, and it also contributes to good practices, better quality, and reproducibility. In engaging with researchers who develop research software, we found that the majority would be interested in automated ways of producing an overview of their software, increasing FAIRness without much additional effort on their side. In recent years, the scientific community has turned to schema.org, a controlled vocabulary that makes it easier to provide FAIR descriptions of datasets, software, and publications, among others. Efforts built on top of schema.org, such as CodeMeta, Bioschemas, and the Machine-actionable Software Management Plan Metadata Schema, aim to improve the metadata descriptions of research software. To make this easier for researchers, some efforts on (semi)automatic metadata extraction, mostly from GitHub, have emerged. However, none of the current efforts provides high metadata coverage, partly because multiple sources must be considered and harmonized. While some structured metadata is available, e.g., from the GitHub API and citation files, the metadata coverage needs to be extended with Machine Learning approaches that use other sources (e.g., analyzing the README file). The Connected Open-Source Software project (ConnOSS) will provide an infrastructure of GitHub/GitLab web pages that showcases the software produced by research groups with consistent, harmonized, and enriched machine-actionable metadata, thus improving the visibility and FAIRness of research software. In this way, it will make it easier for researchers to align with good practices and thereby improve the quality and reproducibility of research software, and for aggregators and registries to harvest high-quality metadata, enabling the creation of, e.g., knowledge graphs. ConnOSS will build on top of existing approaches and will add a Machine Learning approach to enrich the metadata. It will also support researchers with adoption tutorials and appropriate training. Furthermore, the ConnOSS infrastructure and Machine Learning model, together with the extracted metadata, will adhere to FAIR and open-access practices.
DFG Programme: Research data and software (Scientific Library Services and Information Systems)
 
 

