Project Details

Data analysis infrastructure MetaRbolomics4Galaxy

Subject Area Bioinformatics and Theoretical Biology
Medical Informatics and Medical Bioinformatics
Term since 2025
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 564004112
 
The profiling and quantification of small molecules from mass spectrometry (MS) data is an important task in several disciplines, e.g. metabolomics (biochemistry), chemical diversity (ecology and biodiversity), composition and contamination of food, wastewater analysis and freshwater monitoring (environmental research). Computational mass spectrometry methods provide tools and algorithms to quantify and statistically analyse small molecules in the experimental context. In a second step, based on MS/MS data, the relevant subset of metabolites is identified or annotated with metabolite class information for the (bio-)chemical interpretation of the experimental hypothesis. The worldwide Bioconductor community is developing and maintaining a rich set of R packages for precise and repeatable analysis of biological data from various areas in the life sciences. The worldwide Galaxy community is developing the web-based Galaxy workflow e-Research ecosystem, which allows users to execute tools from its ToolShed on either public or private instances, often powered by institutional (e.g., the Eurac Research HPC Cluster), regional (e.g., the cluster in South Tyrol) or national HPC infrastructures such as the de.NBI and ELIXIR-DE usegalaxy.eu instance at the University of Freiburg. Local execution of analyses is particularly important to enable analyses of individual-level, health-related human data within the legal GDPR framework. In the MetaRbolomics4Galaxy project, we will develop and expand a Bioconductor package ecosystem for large-scale, cloud-enabled and reproducible metabolomics data analysis, along with its integration into the Galaxy e-Research system. User-friendly applications providing high-quality and interactive visualisation and analysis will also be adapted and integrated into the Galaxy system.
An example application is MetFamily, developed at the IPB Halle, which was one of the first tools for the integrated analysis of MS quantification data and MS/MS chemical characterisation data. In addition to deployment to global HPC infrastructures, the developed tools will also be installed on regional HPC systems to serve the needs of local researchers and empower their research. We will use software engineering best practices, including an upstream-first approach; continuous integration for code, test coverage and documentation rendering; software containers for packaging, including software bills of materials; and enabling the use of high-performance computing to facilitate analyses of large data sets. Through the networking activities, we will exchange these best practices with research software developers in the R, Galaxy and metabolomics communities.
DFG Programme Research data and software (Scientific Library Services and Information Systems)
International Connection Italy
Partner Organisation Autonome Provinz Bozen - Südtirol
Co-Investigator Dr. Henriette Uthe
Cooperation Partner Dr. Johannes Rainer