Project Details

Methods and Tools to Advance the Retrieval of Mathematical Knowledge from Digital Libraries for Search-, Recommendation- and Assistance-Systems

Subject Area: Data Management, Data-Intensive Systems, Computer Science Methods in Business Informatics
Term: from 2017 to 2022
Project identifier: Deutsche Forschungsgemeinschaft (DFG) - Project number 350192710
 
The goal of our project is to investigate fundamental methods and tools for making mathematical knowledge accessible to information retrieval systems. Achieving this goal requires methods that reliably extract mathematical knowledge from documents. In natural language processing (NLP), a number of well-established, general-purpose text processing methods and tools exist that are applied to a text to enable domain-specific extraction tasks. Taking state-of-the-art text processing tools such as the Stanford NLP toolkit as a model, our research will determine how comparable tools for processing mathematical language can be realized.

Our approach is to expand upon the concept of Mathematical Language Processing (MLP), whose feasibility we demonstrated when we presented it at ACM SIGIR 2016. In this project, we will build on that preliminary research to make the approach more effective and applicable to real-world mathematical information retrieval applications. Specifically, the project has the following objectives:

1. Identify mathematical formulae and expressions in documents, and reliably differentiate them from similar or neighboring structures.
2. Perform type detection and tokenization of mathematical expressions.
3. Extract the corresponding mathematical concepts from the tokenized mathematical formulae and expressions.

Our goal is to enable other scientists to use our methods and tools for mathematical language processing to tackle their own novel problems. We hope that MLP will continue to improve in this process, as was once the case for early NLP approaches.

A wide variety of applications would benefit from advances in mathematical information retrieval. In the STEM disciplines, improvements could be made to academic literature search, literature recommendation, and even plagiarism prevention. Additionally, expert search and applications in pure mathematics, such as theorem search or definition lookup, would benefit significantly from our developments. Applications beyond STEM fields include improved tutoring assistance tools, as well as patent search and enterprise search, which could become more valuable to companies if they integrate math-aware information retrieval methods.
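To make the first two objectives more concrete, the following is a minimal illustrative sketch in Python. It is not the project's implementation. It assumes that formulae appear as inline LaTeX between single dollar signs and uses a deliberately tiny token grammar; the names `find_formulae`, `tokenize`, and `identifiers` are hypothetical. Concept extraction (objective 3) is not covered.

```python
import re

# Assumption: formulae are inline LaTeX delimited by single dollar signs.
MATH_SPAN = re.compile(r"\$([^$]+)\$")

# Assumption: a small token grammar chosen for illustration only.
TOKEN = re.compile(r"""
    (?P<command>\\[A-Za-z]+)      # LaTeX commands such as \alpha or \frac
  | (?P<number>\d+(?:\.\d+)?)     # integer or decimal literals
  | (?P<identifier>[A-Za-z])      # single-letter identifiers
  | (?P<operator>[=+\-*/^_<>])    # a few common operators
  | (?P<bracket>[(){}\[\]])       # grouping symbols
""", re.VERBOSE)


def find_formulae(text):
    """Return the LaTeX source of every $...$ span found in the text."""
    return [m.group(1).strip() for m in MATH_SPAN.finditer(text)]


def tokenize(formula):
    """Split a formula into (token_type, token_text) pairs."""
    return [(m.lastgroup, m.group()) for m in TOKEN.finditer(formula)]


def identifiers(formula):
    """List the distinct single-letter identifiers in order of first use."""
    seen = []
    for kind, value in tokenize(formula):
        if kind == "identifier" and value not in seen:
            seen.append(value)
    return seen


if __name__ == "__main__":
    sample = "The energy $E = m c^2$ relates mass $m$ and speed $c$."
    for formula in find_formulae(sample):
        print(formula, tokenize(formula))
```

A delimiter-based pattern like this one would also match the text between two currency amounts such as "$5 and $10", which illustrates why reliably differentiating formulae from similar structures is listed as a separate objective.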
DFG Programme: Research Grants
International Connection: USA
 
 

