Digitisation / Cataloguing of non-textual objects: eScience-compliant standards for morphology
Final Report Abstract
In order to achieve our goal of a highly flexible application that fully exploits semantic technology, we deviated substantially from the work plan originally outlined in the proposal; the DFG approved this change after consultation. Before developing a prototype application for generating Anatomy Knowledge Graphs, we therefore developed the Semantic Programming Ontology (SPrO) together with an accompanying Java-based middleware, which we used as a semantic programming language. Resources from SPrO function as commands, attributes, and variables and can be used for describing web-based data-centric applications, with each description forming an ontology in its own right, i.e., the application's source code ontology (SCO). The SCO thus provides the steering logic for the application. The accompanying middleware functions as an interpreter: it treats the descriptions in an SCO as specifications of the application and dynamically executes them, thereby producing the application and controlling its behavior. The Semantic Programming approach provides a development framework that not only seamlessly integrates RDF with HTML but also allows domain experts to develop their own data-centric applications with as little programming experience as possible. With its clear separation of steering logic from interpretation logic, semantic programming follows the idea of separating the main layers of an application, analogous to the separation of interpretation logic and presentation logic. With SPrO and its accompanying middleware, semantic programming provides a basic development framework that supports developers of knowledge graph applications. We have used SPrO for describing a semantic web content management system (S-WCMS), called SOCCOMAS, that stores all data and metadata as semantic knowledge graphs.
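The separation of steering logic (the description) from interpretation logic (the middleware) can be illustrated with a minimal sketch. It is not the SPrO vocabulary or the actual Java middleware: the triples, the `ui:` predicates, and the function names below are hypothetical and chosen only to show how an interpreter can produce HTML from a declarative description.

```python
from html import escape

# Hypothetical stand-in for an SCO: plain (subject, predicate, object) triples.
# None of these identifiers are taken from SPrO; they are illustrative only.
SCO = [
    ("form:specimen", "ui:hasEntryField", "field:label"),
    ("field:label", "ui:labelText", "Specimen label"),
    ("field:label", "ui:inputType", "text"),
    ("form:specimen", "ui:hasEntryField", "field:length"),
    ("field:length", "ui:labelText", "Body length (mm)"),
    ("field:length", "ui:inputType", "number"),
]


def objects(triples, subject, predicate):
    """Return all objects of triples matching the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def render_form(triples, form):
    """Toy interpreter: read the description and emit an HTML form for it."""
    lines = [f'<form id="{escape(form)}">']
    for field in objects(triples, form, "ui:hasEntryField"):
        label = objects(triples, field, "ui:labelText")[0]
        input_type = objects(triples, field, "ui:inputType")[0]
        lines.append(
            f'  <label>{escape(label)} '
            f'<input type="{escape(input_type)}" name="{escape(field)}"></label>'
        )
    lines.append("</form>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_form(SCO, "form:specimen"))
```

In this sketch, adding three more triples for a new entry field changes the rendered form without any change to `render_form`, which is the property the approach relies on when a domain expert edits the description rather than the interpreter.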
The SCO of SOCCOMAS contains descriptions of ready-to-use features and workflows typically required by many data-rich web applications, including user administration, login and user registration, session management, user profiles, publication-life-cycle processes (for current draft, backup, recycle bin, deleted draft, current published, and previously published versions), and automatic procedures for tracking overall provenance (creator, authors, creation and publication date, contributors, relation between different versions) and for tracking all changes made to a particular data record at the level of individual entry fields. All the gathered metadata are recorded as semantic knowledge graphs following established metadata standards. Users of SOCCOMAS do not have to interact directly with data in the form of semantic graphs, because SOCCOMAS makes the data and metadata contained in semantic knowledge graphs accessible through websites. Since every document is published under a Creative Commons license, and since all data and metadata are documented as semantic knowledge graphs that are also accessible through a SPARQL endpoint, all data published by an S-WCMS run by SOCCOMAS meet Tim Berners-Lee's 5-star Linked Open Data principles and comply with the FAIR data principles. Using SOCCOMAS and semantic programming, we have developed a module for morphological descriptions for the morphological data repository Morph·D·Base (MDB), utilizing all of the features described above. Additional features have been added through MDB's own SCO. Semantic Morph·D·Base enables users to generate highly standardized and formalized morphological descriptions that are stored in a tuple store framework as semantic Anatomy Knowledge Graphs. When describing an anatomical structure, users can reference any ontology class from any anatomy ontology that is available at BioPortal and describe the structure and all of its parts as instances of these classes.
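Describing a structure and its parts as instances of ontology classes can be sketched as a small set of triples. This is only an illustration under stated assumptions: the `ex:` and `onto:` identifiers and the `ex:has_part` predicate are hypothetical placeholders, not terms from a BioPortal ontology or from the Morph·D·Base data model.

```python
RDF_TYPE = "rdf:type"
HAS_PART = "ex:has_part"  # hypothetical parthood predicate

# Hypothetical instance graph: each instance is typed by an (invented) ontology
# class, and parthood links instances to instances.
GRAPH = {
    ("ex:heart_1", RDF_TYPE, "onto:Heart"),
    ("ex:heart_1", HAS_PART, "ex:ostium_1"),
    ("ex:heart_1", HAS_PART, "ex:ostium_2"),
    ("ex:ostium_1", RDF_TYPE, "onto:Ostium"),
    ("ex:ostium_2", RDF_TYPE, "onto:Ostium"),
    ("ex:ostium_1", HAS_PART, "ex:valve_1"),
    ("ex:valve_1", RDF_TYPE, "onto:Valve"),
}


def parts_of(graph, whole):
    """Return all direct and indirect parts of `whole` (transitive closure)."""
    found = set()
    stack = [whole]
    while stack:
        current = stack.pop()
        for s, p, o in graph:
            if s == current and p == HAS_PART and o not in found:
                found.add(o)
                stack.append(o)
    return found


def classes_of(graph, instance):
    """Return the ontology classes that `instance` is asserted to instantiate."""
    return {o for s, p, o in graph if s == instance and p == RDF_TYPE}


if __name__ == "__main__":
    for part in sorted(parts_of(GRAPH, "ex:heart_1")):
        print(part, sorted(classes_of(GRAPH, part)))
```

Because every part is itself an instance with its own class assertion, a description can be traversed part by part and each part can be compared across descriptions by the class it instantiates.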
Parts can be further described through defined data entry forms. Semantic Morph·D·Base is still in development, but a prototype can be accessed and functions as a proof of concept for SOCCOMAS and our semantic programming approach. Using SOCCOMAS and semantic programming for developing the module for morphological descriptions has proven to save valuable resources and development time. The SCO for the semantic Morph·D·Base prototype has been written by a domain expert with knowledge of ontology engineering but no expertise in any programming language. Furthermore, the approach has shown that changes to the organization of the graphical user interface, such as adding a new entry field, can be made on the fly, which facilitates a user-centered design approach to application development. This has greatly reduced the subsequent GUI-optimization workload. All code is available from our GitHub page. The project also resulted in a community effort to propose a common data model for Anatomy Knowledge Graphs; the corresponding publication is currently in preparation.
Publications
-
Organizing phenotypic data - a semantic data model for anatomy, Journal of Biomedical Semantics
L. Vogt
-
SOCCOMAS: a FAIR Web Content Management System that uses Knowledge Graphs and that is based on Semantic Programming, DATABASE
L. Vogt, R. Baum, P. Bhatty, C. Köhler, S. Meid, B. Quast, and P. Grobe
-
(2015): Emerging semantics to link phenotype and environment, PeerJ. 3: e1470
A.E. Thessen, D.E. Bunker, P.L. Buttigieg, L.D. Cooper, W.M. Dahdul, S. Domisch, N.M. Franz, P. Jaiswal, C.J. Lawrence-Dill, P.E. Midford, C.J. Mungall, M.J. Ramírez, C.D. Specht, L. Vogt, R.A. Vos, R.L. Walls, J.W. White, G. Zhang, A.R. Deans, E. Huala, S.E. Lewis, and P.M. Mabee
-
(2015): Finding Our Way through Phenotypes, PLoS Biology. 13: e1002033
A.R. Deans, S.E. Lewis, E. Huala, S.S. Anzaldo, M. Ashburner, J.P. Balhoff, D.C. Blackburn, J.A. Blake, J.G. Burleigh, B. Chanet, L.D. Cooper, M. Courtot, S. Csösz, H. Cui, W. Dahdul, S. Das, T.A. Dececchi, A. Dettai, R. Diogo, R.E. Druzinsky, M. Dumontier, N.M. Franz, F. Friedrich, G.V. Gkoutos, M. Haendel, L.J. Harmon, T.F. Hayamizu, Y. He, H.M. Hines, N. Ibrahim, L.M. Jackson, P. Jaiswal, C. James-Zorn, S. Köhler, G. Lecointre, H. Lapp, C.J. Lawrence, N. Le Novère, J.G. Lundberg, J. Macklin, A.R. Mast, P.E. Midford, I. Mikó, C.J. Mungall, A. Oellrich, D. Osumi-Sutherland, H. Parkinson, M.J. Ramírez, S. Richter, P.N. Robinson, A. Ruttenberg, K.S. Schulz, E. Segerdell, K.C. Seltmann, M.J. Sharkey, A.D. Smith, B. Smith, C.D. Specht, R.B. Squires, R.W. Thacker, A. Thessen, J. Fernandez-Triana, M. Vihinen, P.D. Vize, L. Vogt, C.E. Wall, R.L. Walls, M. Westerfeld, R.A. Wharton, C.S. Wirkner, J.B. Woolley, M.J. Yoder, A.M. Zorn, and P. Mabee
-
(2016): The word is not enough: on morphemes, characters and ontological concepts, Cladistics. 32(6): 682–690
T. Göpel and S. Richter
-
(2017): Assessing similarity: on homology, characters and the need for a semantic approach to non-evolutionary comparative homology, Cladistics. 33: 513–539
L. Vogt
-
(2017): The first organ-based free ontology for arthropods (Ontology of Arthropod Circulatory Systems - OArCS) and its integration into a novel formalization scheme for morphological descriptions, Systematic Biology. 66(5): 754–768
C.S. Wirkner, T. Göpel, J. Runge, J. Keiler, B.-J. Klussmann-Fricke, K. Huckstorf, S. Scholz, I. Mikó, M. Yoder, and S. Richter
-
(2018): Taxonomy and the production of semantic phenotypes, in: A.E. Thessen (Ed.), Application of Semantic Technology in Biodiversity Science. Studies on the Semantic Web, IOS Press/AKA Verlag, Berlin: pp. 53–77
M.J. Yoder, M.B. Twidale, A.K. Thomas, L. Vogt, N.M. Franz, J. Guo, A.R. Deans, and J. Balhoff
-
(2018): The logical basis for coding ontologically dependent characters, Cladistics. 34: 438–458
L. Vogt
-
(2018): Towards a semantic approach to numerical tree inference in phylogenetics, Cladistics. 34: 200–224
L. Vogt
-
(2019): Bona fideness of material entities and their boundaries, in: R. Davies (Ed.), Natural and Artifactual Objects in Contemporary Metaphysics: Exercises in Analytical Ontology, Bloomsbury Academic, London: pp. 103–120
L. Vogt
-
(2019): Levels and building blocks—toward a domain granularity framework for the life sciences, Journal of Biomedical Semantics. 10: 1–29
L. Vogt
-
(2019): Using Semantic Programming for Developing a Web Content Management System for Semantic Phenotype Data, in: S. Auer and M.-E. Vidal (Eds.), Data Integration in the Life Sciences - Lecture Notes in Computer Science Vol. 2994, Springer Nature, Berlin: pp. 200–206
L. Vogt, R. Baum, C. Köhler, S. Meid, B. Quast, and P. Grobe