Project Details

Pantheon: Efficiently Creating and Maintaining Semantically Meaningful Entity Rankings at Large-Scale

Subject Area Security and Dependability, Operating-, Communication- and Distributed Systems
Term from 2013 to 2020
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 241616207
 
Rankings are an essential means of summarizing the key facets of data. Particularly appealing are rankings that compare entities, for instance, products, people, companies, or countries, according to characteristics like revenue or number of inhabitants. Our efforts within the DFG-funded Pantheon project (MI 1794/1-1) aim at developing methods for generating and maintaining semantically meaningful entity rankings, for instance, the top-10 tallest buildings in Europe or the top-20 universities with respect to the number of their Nobel laureates, using information sources such as knowledge bases. Since the beginning of the project, we have achieved notable research results, published at premier international conferences and workshops. In this proposal for prolongation, we sketch two additional research fields that we believe are essential add-ons to Pantheon: the reliability of obtained rankings and support for data exploration through rankings. Both raise the expected impact for end users and researchers. First, we aim at assessing the degree of incompleteness of rankings, that is, quantifying their trustworthiness in the sense of stability with respect to missing information that could potentially distort the ranking. Rankings that carry only a negligible risk of having been created over too small a set of reliable information are deemed safe to be presented to users. Others are marked incomplete/unsafe and need to be interpreted with care. Second, we plan to investigate ways of using rankings as a means to explore previously barely known datasets or databases. This specifically requires methods to efficiently determine similar or complementary rankings, to form clusters of rankings through similarity self-joins, and to consider user preferences that emphasize specific relations among entities and specific ranking attributes.
We will also investigate the use of rank-aggregation methods to collapse multiple rankings into one and to allow similarity search across such synthetically generated additional rankings as well. The user-driven, exploratory character of this approach makes preprocessing difficult and thus requires efficient methods that allow ad hoc, low-latency interactions with the system.
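As a rough illustration of the two building blocks named above, rank aggregation and ranking similarity, the following Python sketch may help. It is not the project's method: Borda count and the Spearman footrule distance are swapped in here as common textbook choices, and the function names and the placeholder entities "A", "B", "C" are hypothetical.

```python
from collections import defaultdict


def borda_aggregate(rankings):
    """Collapse several rankings into one using Borda count (assumed technique).

    Each ranking is a list of entities, best first. An entity at 0-based
    position p in a ranking of length n receives n - p points.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, entity in enumerate(ranking):
            scores[entity] += n - position
    # Highest total score first; ties are broken by entity name.
    return sorted(scores, key=lambda e: (-scores[e], e))


def footrule_distance(r1, r2):
    """Spearman footrule distance between two top-k lists.

    An entity missing from one list is treated as if it sat at position k,
    one place past the end of the longer list (an assumption of this sketch).
    """
    k = max(len(r1), len(r2))
    pos1 = {e: i for i, e in enumerate(r1)}
    pos2 = {e: i for i, e in enumerate(r2)}
    return sum(abs(pos1.get(e, k) - pos2.get(e, k)) for e in set(r1) | set(r2))


# Three hypothetical rankings over the same placeholder entities.
rankings = [["A", "B", "C"], ["B", "A", "C"], ["A", "C", "B"]]
aggregated = borda_aggregate(rankings)

# Distance of each input ranking to the aggregated one; a small distance
# means the input ranking is similar to the aggregate.
distances = [footrule_distance(r, aggregated) for r in rankings]
```

A similarity search over rankings could then return all rankings whose distance to a query ranking stays below a threshold. The low-latency requirement mentioned above suggests that an index would be needed in place of this linear scan.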
DFG Programme Research Grants
 
 
