Project Details
Episodic Semantic Scene Analysis
Subject Area
Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Term
from 2017 to 2022
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 381855581
Challenge: Research on object detection and recognition, computation of 3D geometric models from sensing, and mapping and localization is maturing. We propose to build on these capabilities to create highly scalable, computationally efficient, 3D semantic representations of an environment that would enable advances in multiple application areas such as robotics and augmented reality. We address two specific challenges: 1) development of a computationally efficient semantic scene mapping framework based on objects and their inter-relationships, and 2) extension of this representation to support efficient reasoning about changes in the environment between scene encounters (episodes) over time. The end goals of this project are 1) a conceptual framework for episodic semantic scene analysis (ESSA) and 2) a corresponding modular software framework for indoor perception systems that operates at the level of scene structural elements and objects, and that is highly scalable in terms of both reconstructing large scenes (spatially) and understanding scene dynamics (temporally and semantically).
Approach: We propose to develop a hierarchical scene model that integrates state-of-the-art perception algorithms for joint 3D object detection with real-time geometry mapping and spatial localization. The model will encompass geometric elements (e.g. planar surfaces), objects (e.g. chairs or tables), constellations of objects (e.g. a place setting), and relationships among them (e.g. a place setting on a table). These representations will be maintained across scene encounters, allowing for rapid re-grounding of the model between encounters, as well as efficient modification of the model due to detected changes. At the same time, communication between the semantic and the geometric layers of our framework can improve the inferred scene structure as well as scene understanding. We will develop ESSA in the context of use cases from robotic manipulation and augmented reality.
DFG impact & expected outcomes: The result of this project will be contributions to both the science and practice of building software frameworks that support a broad range of perception-based applications in human-computer interaction and robotics. During the course of this
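The hierarchical scene model described in the approach can be pictured as a layered scene graph over geometric elements, objects, and constellations, with relations among them and a per-episode comparison step. The following is a minimal, hypothetical Python sketch of such a representation; all class and field names (GeometricElement, SceneObject, Constellation, SceneGraph, diff_labels) are illustrative assumptions and not the project's actual software framework.

```python
# Hypothetical sketch of a layered scene-graph representation as described above.
# Names and fields are illustrative assumptions, not the ESSA project's actual API.
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class GeometricElement:
    """Low-level structural element, e.g. a planar surface."""
    element_id: str
    kind: str                      # e.g. "plane", "cylinder"
    parameters: Tuple[float, ...]  # fitted geometric parameters


@dataclass
class SceneObject:
    """Detected object (e.g. chair, table) grounded on geometric elements."""
    object_id: str
    label: str
    pose: Tuple[float, ...]                              # 6-DoF pose in the map frame
    supporting_elements: List[str] = field(default_factory=list)


@dataclass
class Constellation:
    """Group of objects with joint semantic meaning, e.g. a place setting."""
    constellation_id: str
    label: str
    member_objects: List[str] = field(default_factory=list)


@dataclass
class SceneGraph:
    """Scene model for one encounter (episode); relations link any two node IDs."""
    episode_id: str
    elements: Dict[str, GeometricElement] = field(default_factory=dict)
    objects: Dict[str, SceneObject] = field(default_factory=dict)
    constellations: Dict[str, Constellation] = field(default_factory=dict)
    relations: List[Tuple[str, str, str]] = field(default_factory=list)  # (subject, predicate, object)

    def diff_labels(self, other: "SceneGraph") -> Dict[str, List[str]]:
        """Naive episodic comparison: which object labels appeared or disappeared."""
        mine = {o.label for o in self.objects.values()}
        theirs = {o.label for o in other.objects.values()}
        return {"added": sorted(theirs - mine), "removed": sorted(mine - theirs)}
```

In a full system, the label-level diff above would be replaced by geometric re-grounding and relation-aware change detection between episodes; the sketch only illustrates how the layered structure supports such comparisons.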
DFG Programme
Research Grants