Project Details

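Model Building from Experimental Data: Machine Learning and Model Evaluation under Dependencies and Distribution Shifts (Modellbildung aus Experimentaldaten: Maschinelles Lernen und Modellevaluierung unter Abhängigkeiten und Verteilungsverschiebungen)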

Subject Area Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Term from 2012 to 2023
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 225197905
 
Final Report Year 2023

Final Report Abstract

The project studies and develops machine learning methods for data with specific distributional properties. Most existing machine learning approaches assume that data points are independent samples from a fixed distribution (the i.i.d. assumption: independent and identically distributed). Because of the observation protocols used when collecting empirical data, such data can violate this assumption in several ways. The project focuses on two distributional properties: (1) dependencies between data points, which arise from individual effects of, for example, test subjects, and (2) distribution shifts within the data, which are caused, for example, by taking measurements at different locations. To investigate these phenomena, the project considers two application domains in which these properties are characteristic: (1) eye movement data in psychology, which are strongly influenced by individual effects, and (2) ground motion data in seismic risk analysis, in which spatial distribution shifts are caused by different measurement locations.

In the area of individual effects, the central results of the project are models for characterizing individual distributions in sequence data, including fully probabilistic models, combinations of probabilistic models and neural networks, and metric learning models. On the application side, we were able to show that eye movement patterns are highly individual and can therefore also be used for biometric identification of subjects. Compared to existing approaches from the literature, we have substantially increased the identification accuracy.

In the area of distribution shifts, the central results of the project are models that represent continuous spatial distribution shifts in data. Here, a Gaussian process describes the spatial change of model parameters, which in turn describe the relationship between inputs (e.g., earthquake attributes) and outputs (e.g., ground motion). On the application side, we were able to show that such models deliver substantially more accurate predictions of ground motion than i.i.d. models.
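
The idea of a Gaussian process over model parameters can be illustrated with a minimal sketch. It is not the project's actual model: it assumes a single scalar coefficient w(s) that varies with a one-dimensional location s, a zero-mean Gaussian process prior with an RBF kernel, and synthetic data in place of real ground motion records. The function names (`rbf_kernel`, `predict_coefficient`, `demo`) and all hyperparameter values are hypothetical choices made for illustration.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """RBF covariance between two sets of locations of shape (n, d) and (m, d)."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return variance * np.exp(-0.5 * sq_dist / length_scale**2)


def predict_coefficient(locs, x, y, new_locs,
                        length_scale=1.0, variance=1.0, noise_std=0.05):
    """Posterior mean of a spatially varying coefficient w(s).

    Assumed model (illustrative only): y_i = w(s_i) * x_i + noise,
    with w ~ GP(0, RBF kernel) over locations s.
    """
    K = rbf_kernel(locs, locs, length_scale, variance)
    D = np.diag(x)                                   # inputs scale the coefficient
    C = D @ K @ D + noise_std**2 * np.eye(len(y))    # covariance of the outputs y
    K_star = rbf_kernel(new_locs, locs, length_scale, variance)
    # Cov(w(new_locs), y) = K_star @ D, so the posterior mean is:
    return K_star @ D @ np.linalg.solve(C, y)


def demo(seed=0):
    """Compare the spatial model with a single global (i.i.d.) coefficient."""
    rng = np.random.default_rng(seed)
    locs = rng.uniform(0.0, 6.0, size=(60, 1))       # synthetic measurement sites
    x = rng.uniform(1.0, 2.0, size=60)               # synthetic input attribute
    true_w = np.sin(locs[:, 0])                      # coefficient drifts with location
    y = true_w * x + rng.normal(0.0, 0.05, size=60)

    new_locs = np.linspace(0.5, 5.5, 20)[:, None]
    target = np.sin(new_locs[:, 0])

    w_spatial = predict_coefficient(locs, x, y, new_locs)
    w_global = (x @ y) / (x @ x)                     # least squares, one coefficient

    rmse_spatial = np.sqrt(np.mean((w_spatial - target) ** 2))
    rmse_global = np.sqrt(np.mean((w_global - target) ** 2))
    return rmse_spatial, rmse_global


if __name__ == "__main__":
    print(demo())
```

In this synthetic setting the coefficient changes smoothly with location, so a single global coefficient cannot follow it, while the Gaussian process estimate can. This mirrors the qualitative claim of the abstract only. It says nothing about the size of the improvement on real seismic data.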

