Detailseite
Projekt Druckansicht

Querschnitte: XML und formale Sprachen - Theorie und Praxis

Fachliche Zuordnung Theoretische Informatik
Förderung Förderung von 2011 bis 2019
Projektkennung Deutsche Forschungsgemeinschaft (DFG) - Projektnummer 185161317
 
Erstellungsjahr 2018

Zusammenfassung der Projektergebnisse

The Extensible Markup Language (XML) is the standard data format for exchanging data on the World Wide Web. The loose structure of XML is in strong contrast with the rigid structure of relational databases and therefore, the data format XML has posed a wealth of new challenges in the data management community. Since the foundations of XML are based on formal languages (regular expressions, finite automata, regular tree languages, etc.), the deeper research questions in the area are often closely tied to foundational work in formal language theory. This project had two core goals: we want to strengthen the connections between XML research and formal language research and we want to strengthen the ties between theory and practice. These ties between theory and practice should be applicable both in XML research and in formal language research. Both of these goals have been achieved. We have made substantial contributions to formal language theory by solving problems that were motivated by XML research. Examples of such problems were questions about determinism in regular expressions. We solved several openstanding problems in this area, such as the complexity of determining if a regular language is definable by a deterministic regular expression (as used in DTD), and the question if a language is definable by a deterministic regular expression with counters (as used in XML Schema). Another major contribution to formal language theory was the proof that regular languages can be separated by piecewise testable languages in polynomial time. This was the first (nontrivial) tractable separability result in the literature. In the area of incremental query evaluation, we proved that answers to MSO queries on tree-structured data can be enumerated in logarithmic delay, using linear preprocessing time. We further made substantial contributions to database theory by solving the two major open problems concerning tree pattern optimization: the question whether minimality equals nonredundancy, and the computational complexity of the problem. We have also substantially contributed to strengthening the ties between theory and practice. We have developed, from the ground up, a new schema language for XML data, called bonXai. This development spurred theoretical questions but also led to an implementation of a freely available tool for development and validation of bonXai schemas. The tool can be used not only to develop schemas but also to analyse and better understand XML Schema definitions.

Projektbezogene Publikationen (Auswahl)

 
 

Zusatzinformationen

Textvergrößerung und Kontrastanpassung