Project Details
Extracting Implicit Relations from Natural Language Text
Applicant
Professor Dr. Michael Roth
Subject Area
Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Term
from 2015 to 2017
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 270126209
The focus of the proposed project lies on Relation Extraction (RE), the task of extracting relations between entities (e.g. x-worksAt-y, x-isDaughterOf-z) from natural language text. Methods from relation extraction are invaluable for data mining and other NLP applications, including the mining of protein interactions from biomedical literature and the induction of knowledge bases from the web. To achieve good generalisation, current state-of-the-art systems process and extract relations only from small units of text (e.g. individual sentences). This approach contrasts with the fact that information in text is, in practice, often spread across sentence boundaries. As a result, RE systems often fail to cover crucial information and relations. To illustrate this shortcoming, consider Example 1, in which an 'isDaughterOf' relation is expressed across two sentences: (1) "Emile has three children. Karine is the oldest daughter."
As human beings, we are able to aggregate information across sentences and to infer relations in a precise manner. The goal of this project is to develop methods that enable the same kind of inference computationally. Towards this goal, the proposed work focusses on completing semantic predicate-argument structures that are only partially realised within a sentence. For instance, the nominal predicate "daughter" in Example 1 has two arguments, one of which is realised in a previous sentence.
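The cross-sentence inference in Example 1 can be illustrated with a small toy sketch. It is not the method proposed in this project: the data structures (`Predicate`, `Sentence`), the role names `x` and `y`, and the "most recent entity in a preceding sentence" heuristic are all assumptions made purely for illustration. The sketch only shows why a sentence-level extractor misses the relation and how filling the non-local argument makes it recoverable.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Predicate:
    """A predicate with role fillers; None marks an argument not realised locally."""
    lemma: str
    relation: str
    args: Dict[str, Optional[str]]  # hypothetical roles "x" and "y"


@dataclass
class Sentence:
    text: str
    entities: List[str]
    predicates: List[Predicate] = field(default_factory=list)


def extract_relations(sentences: List[Sentence]) -> List[Tuple[str, str, str]]:
    """Sentence-level extraction: emit a triple only if both arguments are filled."""
    triples = []
    for sent in sentences:
        for pred in sent.predicates:
            x, y = pred.args.get("x"), pred.args.get("y")
            if x is not None and y is not None:
                triples.append((x, pred.relation, y))
    return triples


def resolve_non_local(sentences: List[Sentence]) -> None:
    """Toy heuristic (an assumption, not the project's approach): fill each
    missing argument with the most recent entity from a preceding sentence."""
    for i, sent in enumerate(sentences):
        for pred in sent.predicates:
            for role, filler in list(pred.args.items()):
                if filler is not None:
                    continue
                used = {v for v in pred.args.values() if v is not None}
                for prev in reversed(sentences[:i]):
                    candidates = [e for e in reversed(prev.entities) if e not in used]
                    if candidates:
                        pred.args[role] = candidates[0]
                        break


def example_document() -> List[Sentence]:
    """Example 1, annotated by hand for this illustration."""
    return [
        Sentence("Emile has three children.", ["Emile"]),
        Sentence(
            "Karine is the oldest daughter.",
            ["Karine"],
            [Predicate("daughter", "isDaughterOf", {"x": "Karine", "y": None})],
        ),
    ]


if __name__ == "__main__":
    doc = example_document()
    print(extract_relations(doc))  # nothing is found within single sentences
    resolve_non_local(doc)
    print(extract_relations(doc))  # the relation is found after resolution
```

A recency heuristic of this kind would break down quickly on real text, which is why the resolution of non-local arguments is treated as a research problem in its own right.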
Three work packages are defined to reach the project's objective: in the first work package, a baseline model will be implemented that combines an existing RE system with data that contains automatically resolved non-local arguments; the second task will be to extend this model to a cross-lingual setting, in which language-specific omission phenomena can be exploited to resolve non-local arguments in complementary positions; in the third and final work package, a joint model is to be developed that combines inference mechanisms from relation extraction and methods for resolving non-local arguments, in order to solve both tasks simultaneously.
To achieve the goals outlined above, the proposed project is planned to build upon two state-of-the-art systems from the areas of relation extraction and semantic processing. More specifically, the RE system from Stanford University and the semantic parsing pipeline from the University of Stuttgart are to be applied and extended. For the resolution of non-local arguments, the project is planned to build upon a method developed in my dissertation, which utilises alignments between pairs of comparable texts to identify and resolve missing arguments. In the final part of the project, a constraint-driven learning model will be designed that combines resolution methods and RE techniques, in order to allow a flow of information between the two tasks. The proposed project is expected to unveil synergy effects that will lead to better models for both relation extraction and non-local argument resolution.
DFG Programme
Research Fellowships
International Connection
United Kingdom, USA