Deutsche Vereinigung zur Kuration biologischer Daten (GFBio)
Summary of Project Results
“GFBio is the authoritative, national contact point for issues concerning the management and standardization of biological and environmental research data during the entire data life cycle (from acquisition to archiving and data publication).” (Mission statement) Since 2013, the German Federation for Biological Data (GFBio), funded in three phases by the DFG, has been working on mobilizing and standardizing biodiversity research data and making them discoverable and reusable. GFBio comprised around 20 partner institutions from the scientific community, which developed a common infrastructure and a shared understanding of the challenges surrounding the management of biological research data. Together, the project partners developed tools and services for researchers along the data life cycle, which describes the steps from planning through data collection to publication. One challenge during the work of GFBio was developing a “common language” among project partners with very diverse backgrounds in biology, collection data centers, and IT. In the initial project phases, this process of communication and of building common ground took a lot of effort, which then proved to be very useful. Understanding the needs of all the different partners was key to developing user-oriented services and is a strong foundation for the ongoing NFDI4Biodiversity project, which includes several additional stakeholder groups. A central outcome of the collaboration in GFBio is the data portal, which makes data from 10 affiliated data centers accessible and provides a central point of contact and advice on all issues related to the standardization, publication, and management of biological and ecological research data.
The portal is the entry point to all services offered, starting with general support on data management, which is provided by the GFBio expert network through the Helpdesk and through the public GFBio Knowledge Base, where users can browse FAQs and articles about the services offered and about general data management issues. The training page and training activities such as roadshows and dedicated workshops, together with the social media presence, brought the community into contact with research data management and with the GFBio services and support. GFBio offers a submission and brokerage service. Users can upload their (heterogeneous) research data, which are then evaluated by the data curators of the GFBio data centers and assigned to the appropriate data center(s). The data are then curated, archived, published with persistent identifiers, and interlinked. The data archived at the GFBio data centers can be explored in the data portal, which enables searching across more than 15.5 million data sets. Many georeferenced data sets from the GFBio search can be visualized and analyzed with the Visualization, Analysis & Transformation Tool (VAT-tool), which also allows the upload and analysis of unpublished, user-generated data. The VAT-tool has now been spun off under the name Geoengine and will be further developed. To enhance the interoperability of data, a Terminology Service was set up and developed, allowing semantic enrichment of search strings as well as access to a large number of biological and environmental terminologies. Internally, work processes were standardized. The collection data centers agreed on GFBio consensus metadata and on a common workflow for publishing data in a common format. A harvesting workflow was developed and technical interfaces were created to enable accessibility, harmonization, and (semantic) enrichment of the data.
Overall, GFBio was successful in building up an infrastructure for the management and publication of biological and environmental data and thus forms a solid basis for the National Research Data Infrastructure for Biodiversity Data, NFDI4Biodiversity.
