Efficient Semantic Search on Big Data

Applicant Professor Dr. Hannah Bast

Subject Area Theoretical Computer Science
Data Management, Data-Intensive Systems, Computer Science Methods in Business Informatics
Security and Dependability, Operating-, Communication- and Distributed Systems

Term from 2014 to 2020

Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 254890286

Project Description

This project is about efficient semantic search on big data, notably very large text collections and very large knowledge bases. In the first round of this SPP, we made the following contributions: a new search engine for interactive combined search on text and knowledge bases; a new scalable algorithm for decomposing text into its semantically coherent units; a new framework and algorithm for computing relevance scores for knowledge base triples; a self-learning question answering system for automatically translating natural-language questions into knowledge base queries; and a comprehensive survey of the vast field of semantic search on text and knowledge bases.

In the next round of this SPP, we plan to develop improved solutions for some of these problems, as well as solutions for new problems that arose during our work in the first round: a full-featured SPARQL+Text engine (existing SPARQL engines have only moderately powerful text-search extensions, whereas our engine from the first round supports only tree-shaped queries and relies on their incremental construction); an extension of our question answering system to query patterns that are more complex and may involve a text-search component; a system for automatic completion of natural-language questions; and improved large-scale named entity recognition and disambiguation for semantic search.

For all the named problems, our goals are, as in the first round: provably efficient algorithms and data structures; an extensive experimental evaluation of their efficiency and quality; open-source software and a publicly accessible demonstrator or prototype; and full reproducibility of our results by providing all relevant materials (if possible) or a dedicated web application.

DFG Programme Priority Programmes

Subproject of SPP 1736: Algorithms for Big Data
