Project Details
Visual Analytics of Online Streaming Text
Applicant
Professor Dr. Thomas Ertl
Subject Area
Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Term
from 2018 to 2023
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 392087235
Our global ways of communication are heavily based on the exchange of unstructured textual information. With the rise of short text messaging, social networks, and online news media, the frequent consumption and production of such data have become a defining element of modern society. A broad range of application areas could therefore benefit from the ability to discover co-occurring topics, analyze correlated user behavior, and detect anomalous content in these data sources. At the same time, however, we face unprecedented threats introduced by the fast and uncontrolled global spread of misinformation and rumors. Malicious activities, such as the deployment of social bot networks to disseminate false claims or to sabotage public discourse, have been observed frequently. While automated content-spreading algorithms used to be limited in their capabilities, we now see advanced behavioral complexity, believable reactions, and sophisticated, orchestrated campaigns.

To understand the evolution of content patterns, detect anomalous information, and discover large-scale coordinated activities, we have to cope with the inherent challenges of real-time streaming text. While most past research has been directed towards batch corpus processing, only limited attention has been given to the challenge of analyzing live-streaming textual data. In this project, we propose to close this gap by creating a novel Visual Analytics methodology that adapts and extends natural language processing, machine learning, and visual interfaces, and integrates them into an interactive analytical pipeline. We will first acquire a suitable benchmark repository from social networks, news wires, and microblogs, and create a system that allows simulated replay to enable realistic evaluation scenarios.
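As an illustration of the simulated-replay idea, archived messages can be re-emitted with their original inter-arrival gaps (optionally compressed) so that downstream components experience them as a live stream. The following is a minimal sketch; the `Message` type and `replay` helper are hypothetical names, not part of the project's actual system.

```python
import time
from dataclasses import dataclass
from typing import Iterable, Iterator

@dataclass
class Message:
    timestamp: float  # seconds since epoch (hypothetical schema)
    text: str

def replay(messages: Iterable[Message], speedup: float = 1.0) -> Iterator[Message]:
    """Re-emit archived messages in timestamp order, sleeping for the
    original inter-arrival gap divided by `speedup`, to simulate a
    live stream from a static benchmark corpus."""
    prev_ts = None
    for msg in sorted(messages, key=lambda m: m.timestamp):
        if prev_ts is not None:
            gap = (msg.timestamp - prev_ts) / speedup
            if gap > 0:
                time.sleep(gap)
        prev_ts = msg.timestamp
        yield msg
```

A high `speedup` allows the same corpus to drive both realistic real-time evaluations and fast regression runs of the analytical pipeline.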
Using this system, we will investigate the real-time applicability, adaptability, and extensibility of existing visual metaphors and interaction patterns.

Finally, to move from sampled analysis to a large-scale understanding of evolving topics, correlated behaviors, and sampling uncertainties, we will integrate our visualization and interaction methods with specifically adapted text-mining tools. Here, we will particularly elaborate on the extensibility of generative content models as well as evolutionary topic hierarchy clustering. By integrating the existing tools into the visual interaction pipeline, we can allow task-centered pre-aggregation and filtering to reduce the cognitive load on the analyst. Moreover, by opening the black box of established methods, we will combine them with online visual configuration and control. Based on context knowledge, expertise, and intuition, the analyst will then be able to take back control of the iterative computational process, help the system interpret intermediate results, and maintain continuous alignment with analytical reasoning.
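To make the notion of task-centered pre-aggregation concrete, one simple form is a sliding-window term summary: instead of forwarding every raw message to the visual interface, the pipeline maintains counts over a recent time window and surfaces only the top terms. This is an illustrative sketch, not the project's actual aggregation method; the function name and stream format are assumptions.

```python
from collections import Counter, deque
from typing import Deque, Iterable, Iterator, List, Tuple

def windowed_top_terms(
    stream: Iterable[Tuple[float, List[str]]],  # (timestamp, tokenized message)
    window: float = 60.0,
    k: int = 5,
) -> Iterator[Tuple[float, List[Tuple[str, int]]]]:
    """After each incoming message, evict messages older than `window`
    seconds and yield the current top-k terms -- a pre-aggregated view
    that reduces what the analyst has to inspect."""
    buf: Deque[Tuple[float, List[str]]] = deque()
    counts: Counter = Counter()
    for ts, tokens in stream:
        buf.append((ts, tokens))
        counts.update(tokens)
        # Evict expired messages and their term counts.
        while buf and buf[0][0] < ts - window:
            _, old_tokens = buf.popleft()
            counts.subtract(old_tokens)
        counts += Counter()  # drop zero/negative entries
        yield ts, counts.most_common(k)
```

In a full pipeline, the analyst's current task would steer parameters such as `window` and `k`, or replace raw term counts with topic-model assignments.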
DFG Programme
Research Grants
International Connection
China
Partner Organisation
National Natural Science Foundation of China
Cooperation Partner
Yingcai Wu, Ph.D.