Model Building from Experimental Data: Machine Learning and Model Evaluation under Dependencies and Distribution Shifts
Final Report Abstract
The project studies and develops machine learning methods for data with specific distributional properties. Most existing machine learning approaches assume that data points are independent samples from a fixed distribution (i.i.d., for independent and identically distributed). Because of the observation protocols used when collecting empirical data, such data can violate this assumption in several ways. Within the project, we focus on two distributional properties: (1) dependencies between data points, which arise from individual effects of, for example, test subjects, and (2) distribution shifts within the data, which are caused, for example, by taking measurements at different locations. To investigate these phenomena, the project concentrates on two application domains in which these properties are characteristic: (1) eye movement data in psychology, which are strongly influenced by individual effects, and (2) ground motion data in seismic risk analysis, in which spatial distribution shifts are caused by different measurement locations. Central results of the project in the area of individual effects are models for characterizing individual distributions in sequence data, including fully probabilistic models, combinations of probabilistic models and neural networks, and metric learning models. On the application side, we were able to show that eye movement patterns are highly individual and can therefore be used for biometric identification of subjects. Compared to existing approaches from the literature, we have substantially increased the identification accuracy. Central results of the project in the area of distribution shifts are models that represent continuous spatial distribution shifts in data. Here, a Gaussian process describes how the model parameters change across space, and these parameters in turn describe the relationship between inputs (e.g., earthquake attributes) and outputs (e.g., ground motion). On the application side, we were able to show that such models deliver substantially more accurate predictions of ground motion than i.i.d. models.
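The spatially varying model described in the abstract can be illustrated with a small sketch. It is not the project's implementation. It assumes a linear model y = x^T w(s) + noise, in which each coefficient of w has an independent zero-mean Gaussian process prior over location s with a shared squared-exponential kernel. Under these assumptions the outputs themselves follow a Gaussian process whose covariance is the product of the spatial kernel and the inner product of the inputs. All function names, hyperparameters, and the synthetic data are hypothetical.

```python
import numpy as np


def rbf_kernel(s_a, s_b, length_scale=1.0):
    """Squared-exponential kernel between location sets of shape (n_a, d) and (n_b, d)."""
    sq_dist = ((s_a[:, None, :] - s_b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)


def product_kernel(x_a, s_a, x_b, s_b, length_scale=1.0):
    """Covariance of y = x^T w(s) under independent zero-mean GP priors on the coefficients."""
    return rbf_kernel(s_a, s_b, length_scale) * (x_a @ x_b.T)


def predict(x_train, s_train, y_train, x_test, s_test,
            noise_var=0.1, length_scale=1.0):
    """Posterior mean of standard GP regression with the product kernel."""
    k_train = product_kernel(x_train, s_train, x_train, s_train, length_scale)
    k_cross = product_kernel(x_test, s_test, x_train, s_train, length_scale)
    alpha = np.linalg.solve(k_train + noise_var * np.eye(len(y_train)), y_train)
    return k_cross @ alpha


# Synthetic illustration (assumed data): coefficients drift smoothly with location.
rng = np.random.default_rng(0)
n = 200
s = rng.uniform(0.0, 1.0, size=(n, 2))        # measurement locations
x = rng.normal(size=(n, 2))                   # input attributes
w = np.stack([1.0 + s[:, 0], -2.0 * s[:, 1]], axis=1)  # location-dependent coefficients
y = (x * w).sum(axis=1) + 0.1 * rng.normal(size=n)

# Fit on the first 150 points, predict the remaining 50.
y_hat = predict(x[:150], s[:150], y[:150], x[150:], s[150:],
                noise_var=0.01, length_scale=0.5)
rmse = np.sqrt(np.mean((y_hat - y[150:]) ** 2))
print(f"held-out RMSE of the varying-coefficient sketch: {rmse:.3f}")
```

An i.i.d. baseline in this setting would fit a single coefficient vector for all locations. The sketch instead lets nearby locations share information through the spatial kernel while distant locations may have different coefficients.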
Publications
- A Model of Individual Differences in Gaze Control During Reading. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) (2014), 1810-1815. Association for Computational Linguistics.
  Landwehr, Niels; Arzt, Sebastian; Scheffer, Tobias & Kliegl, Reinhold
- A Nonergodic Ground-Motion Model for California with Spatially Varying Coefficients. Bulletin of the Seismological Society of America, 106(6), 2574-2583.
  Landwehr, Niels; Kuehn, Nicolas M.; Scheffer, Tobias & Abrahamson, Norman
- A Semiparametric Model for Bayesian Reader Identification. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (2016), 585-594. Association for Computational Linguistics.
  Abdelwahab, Ahmed; Kliegl, Reinhold & Landwehr, Niels
- Varying-coefficient models for geospatial transfer learning. Machine Learning, 106(9-10), 1419-1440.
  Bussas, Matthias; Sawade, Christoph; Kühn, Nicolas; Scheffer, Tobias & Landwehr, Niels
- A Discriminative Model for Identifying Readers and Assessing Text Comprehension from Eye Movements. Lecture Notes in Computer Science (2019), 209-225. Springer International Publishing.
  Makowski, Silvia; Jäger, Lena A.; Abdelwahab, Ahmed; Landwehr, Niels & Scheffer, Tobias
- Detecting Autism by Analyzing a Simulated Social Interaction. Lecture Notes in Computer Science (2019), 193-208. Springer International Publishing.
  Drimalla, Hanna; Landwehr, Niels; Baskow, Irina; Behnia, Behnoush; Roepke, Stefan; Dziobek, Isabel & Scheffer, Tobias
- Probabilistic Seismic Hazard Analysis in California Using Nonergodic Ground Motion Models. Bulletin of the Seismological Society of America, 109(4), 1235-1249.
  Abrahamson, Norman; Kuehn, Nicolas; Walling, Melanie & Landwehr, Niels
- How the Selection of Training Data and Modeling Approach Affects the Estimation of Ammonia Emissions from a Naturally Ventilated Dairy Barn—Classical Statistics versus Machine Learning. Sustainability, 12(3), 1030.
  Hempel, Sabrina; Adolphs, Julian; Landwehr, Niels; Janke, David & Amon, Thomas
- Quantile Layers: Statistical Aggregation in Deep Neural Networks for Eye Movement Biometrics. Lecture Notes in Computer Science (2020), 332-348. Springer International Publishing.
  Abdelwahab, Ahmed & Landwehr, Niels
- Supervised Machine Learning to Assess Methane Emissions of a Dairy Building with Natural Ventilation. Applied Sciences, 10(19), 6938.
  Hempel, Sabrina; Adolphs, Julian; Landwehr, Niels; Willink, Dilya; Janke, David & Amon, Thomas
- Towards the automatic detection of social biomarkers in autism spectrum disorder: introducing the simulated interaction task (SIT). npj Digital Medicine, 3(1).
  Drimalla, Hanna; Scheffer, Tobias; Landwehr, Niels; Baskow, Irina; Roepke, Stefan; Behnia, Behnoush & Dziobek, Isabel
- Deep Distributional Sequence Embeddings Based on a Wasserstein Loss. Neural Processing Letters, 54(5), 3749-3769.
  Abdelwahab, Ahmed & Landwehr, Niels