Project Details
Projekt Print View

Querschnitte: XML und formale Sprachen - Theorie und Praxis

Subject Area Theoretical Computer Science
Term from 2011 to 2019
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 185161317
 
Final Report Year 2018

Final Report Abstract

The Extensible Markup Language (XML) is the standard data format for exchanging data on the World Wide Web. The loose structure of XML is in strong contrast with the rigid structure of relational databases and therefore, the data format XML has posed a wealth of new challenges in the data management community. Since the foundations of XML are based on formal languages (regular expressions, finite automata, regular tree languages, etc.), the deeper research questions in the area are often closely tied to foundational work in formal language theory. This project had two core goals: we want to strengthen the connections between XML research and formal language research and we want to strengthen the ties between theory and practice. These ties between theory and practice should be applicable both in XML research and in formal language research. Both of these goals have been achieved. We have made substantial contributions to formal language theory by solving problems that were motivated by XML research. Examples of such problems were questions about determinism in regular expressions. We solved several openstanding problems in this area, such as the complexity of determining if a regular language is definable by a deterministic regular expression (as used in DTD), and the question if a language is definable by a deterministic regular expression with counters (as used in XML Schema). Another major contribution to formal language theory was the proof that regular languages can be separated by piecewise testable languages in polynomial time. This was the first (nontrivial) tractable separability result in the literature. In the area of incremental query evaluation, we proved that answers to MSO queries on tree-structured data can be enumerated in logarithmic delay, using linear preprocessing time. We further made substantial contributions to database theory by solving the two major open problems concerning tree pattern optimization: the question whether minimality equals nonredundancy, and the computational complexity of the problem. We have also substantially contributed to strengthening the ties between theory and practice. We have developed, from the ground up, a new schema language for XML data, called bonXai. This development spurred theoretical questions but also led to an implementation of a freely available tool for development and validation of bonXai schemas. The tool can be used not only to develop schemas but also to analyse and better understand XML Schema definitions.

Publications

  • Efficient Separability of Regular Languages by Subsequences and Suffixes. In International Conference on Automata, Languages, and Programming (ICALP), pages 150–161, 2013
    Wojciech Czerwinski, Wim Martens, and Tomas Masopust
    (See online at https://doi.org/10.1007/978-3-642-39212-2_16)
  • Simplifying XML Schema: Single-type approximations of regular tree languages. J. Comput. Syst. Sci., 79(6): 910–936, 2013
    Wouter Gelade, Tomasz Idziaszek, Wim Martens, Frank Neven, Jan Paredaens
    (See online at https://doi.org/10.1016/j.jcss.2013.01.009)
  • MSO queries on trees: enumerating answers under updates. In Joint Meeting of the Conference on Computer Science Logic (CSL) and the ACM/IEEE Symposium on Logic in Computer Science (LICS), pages 67:1–67:10, 2014
    Katja Losemann, Wim Martens
    (See online at https://doi.org/10.1145/2603088.2603137)
  • Definability by Weakly Deterministic Regular Expressions with Counters is Decidable. In Mathematical Foundations of Computer Science (MFCS), pages 369–381, 2015
    Markus Latte, Matthias Niewerth
    (See online at https://doi.org/10.1007/978-3-662-48057-1_29)
  • Closure properties and descriptional complexity of deterministic regular expressions. Theor. Comput. Sci. 627: 54–70, 2016
    Katja Losemann, Wim Martens, Matthias Niewerth
    (See online at https://doi.org/10.1016/j.tcs.2016.02.027)
  • Querying Graphs with Data. J. ACM, 63(2), 14:1–14:53, 2016
    Leonid Libkin, Wim Martens, and Domagoj Vrgoč
    (See online at https://doi.org/10.1145/2850413)
  • BonXai: Combining the Simplicity of DTD with the Expressiveness of XML Schema. ACM Trans. Database Syst., 42(3): 15:1–15:42, 2017
    Wim Martens, Frank Neven, Matthias Niewerth, Thomas Schwentick
    (See online at https://doi.org/10.1145/3105960)
  • Deciding definability by deterministic regular expressions. J. Comput. Syst. Sci. 88: 75–89, 2017
    Wojciech Czerwinski, Claire David, Katja Losemann, Wim Martens
    (See online at https://doi.org/10.1016/j.jcss.2017.03.011)
  • Minimization of Tree Patterns. J. ACM 65(4):26:1–26:46, 2018
    Wojciech Czerwinski, Wim Martens, Matthias Niewerth, Pawel Parys
    (See online at https://doi.org/10.1145/3180281)
  • Reasoning about XML Constraints based on XML-to-relational mappings. Theory Comput. Syst. 62(8): 1826–1879, 2018
    Matthias Niewerth, Thomas Schwentick
    (See online at https://doi.org/10.1007/s00224-018-9846-5)
 
 

Additional Information

Textvergrößerung und Kontrastanpassung