Project Details
Mathematical methods and algorithms for learning effective embeddings of semi-structured information for anomaly detection problems
Applicant
Professor Dr. Martin Spindler
Subject Area
Statistics and Econometrics
Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Funding
Funded since 2021
Project Identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 448795504
The rise of digitization has made huge, novel data sets available, many of which are semi-structured. Although analyzing such data is challenging, it offers great opportunities for researchers. The goal of the project is to develop models for better anomaly detection based on semi-structured data. The health care industry poses challenges tied to important applications such as fraud detection, recommendation systems and decision support systems, which can be addressed by learning from collected data. Economic and financial time series likewise require outlier and novelty detection as an important first processing step. In these domains it is vital to detect anomalies and outliers because they are highly relevant. For example, fraudulent claims, which usually differ considerably from ordinary claims, should be detected. In clinical and medical decision support systems, unusual cases that need special treatment should be identified. For economic and financial data, outlier and change detection should be performed automatically.
The goal of this project is to develop Deep Learning and Machine Learning methods for anomaly and outlier detection and to apply them to the tasks mentioned above, namely fraud detection in insurance and outlier detection in financial time series. This is possible because all of these tasks in health care, economics and finance share the same type of input data: sequences of varying length, which are a form of semi-structured data.
The project consists of three parts. First, we will develop efficient deep representations and embeddings of semi-structured information such as graphs and sequences. In doing so, we will construct efficient semantic-level similarity measures, which will allow us to establish what counts as normal and thus to detect anomalies. Second, we will develop effective end-to-end learnable approaches to anomaly detection and imbalanced classification for semi-structured information. Third, we will develop problem-oriented data mining approaches for fraud detection, outlier detection in (financial) time series, recommendation systems and decision support systems, with applications in health care, insurance, finance and economics. To sum up, the final goal of this proposal is to enable effective representations of semi-structured information and to develop end-to-end approaches for anomaly detection that are ready to use for solving real-world applied problems.
DFG Programme
Research Grants
International Connection
Russia
Partner Organisation
Russian Foundation for Basic Research, until 3/2022
Cooperation Partner
Professor Dr. Evgeny Burnaev, until 3/2022