Project Details
Projekt Print View

Querschnitte: XML und formale Sprachen - Theorie und Praxis

Subject Area Theoretical Computer Science
Term from 2011 to 2019
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 185161317
 
Final Report Year 2018

Final Report Abstract

The Extensible Markup Language (XML) is the standard data format for exchanging data on the World Wide Web. The loose structure of XML is in strong contrast with the rigid structure of relational databases and therefore, the data format XML has posed a wealth of new challenges in the data management community. Since the foundations of XML are based on formal languages (regular expressions, finite automata, regular tree languages, etc.), the deeper research questions in the area are often closely tied to foundational work in formal language theory. This project had two core goals: we want to strengthen the connections between XML research and formal language research and we want to strengthen the ties between theory and practice. These ties between theory and practice should be applicable both in XML research and in formal language research. Both of these goals have been achieved. We have made substantial contributions to formal language theory by solving problems that were motivated by XML research. Examples of such problems were questions about determinism in regular expressions. We solved several openstanding problems in this area, such as the complexity of determining if a regular language is definable by a deterministic regular expression (as used in DTD), and the question if a language is definable by a deterministic regular expression with counters (as used in XML Schema). Another major contribution to formal language theory was the proof that regular languages can be separated by piecewise testable languages in polynomial time. This was the first (nontrivial) tractable separability result in the literature. In the area of incremental query evaluation, we proved that answers to MSO queries on tree-structured data can be enumerated in logarithmic delay, using linear preprocessing time. We further made substantial contributions to database theory by solving the two major open problems concerning tree pattern optimization: the question whether minimality equals nonredundancy, and the computational complexity of the problem. We have also substantially contributed to strengthening the ties between theory and practice. We have developed, from the ground up, a new schema language for XML data, called bonXai. This development spurred theoretical questions but also led to an implementation of a freely available tool for development and validation of bonXai schemas. The tool can be used not only to develop schemas but also to analyse and better understand XML Schema definitions.

Publications

 
 

Additional Information

Textvergrößerung und Kontrastanpassung