Project Details

Visual Analytics of Complex Event Sequence Data

Subject Area Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Term since 2021
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 449742818
 
Event sequences model an ordered series of events that have occurred over a period of time. Such sequences play an important role in a wide range of applications. For instance, electronic health records (EHRs) contain time-stamped medical events (e.g., diagnoses, lab tests) for patients recorded over the course of a clinical process. As another example, a typical software development process includes a series of events such as commits to a repository. Similarly, the empirical evaluation of graphical interfaces may record user behaviors such as gazing, typing, or mouse clicking, which can also be modeled by event sequences.
In this project, we aim to develop novel analysis and visualization techniques to support users in obtaining insights into event sequence data in different application domains. How can we discover event patterns from a large collection of event sequences so that people can find the underlying rules to help interpret causality within a sequence and even predict future events? Unfortunately, real-world event sequences are usually large in scale, diverse in event type, and variable in length, and their events may occur in different orders and last for different durations, which makes the summarization of event patterns difficult. We plan to comprehensively analyze complex event sequence data by creating a novel visual analytics method that adapts and extends data mining and machine learning techniques, integrating them into an interactive visual analysis pipeline.
Specifically, we will first develop techniques for preprocessing and cleansing complex sequence data in order to estimate the importance of events, split long sequences into segments based on the underlying semantics, align sequences of different lengths, and remove noise from input data. After that, we will develop a knowledge-oriented representation learning technique that transforms various types of events into a uniform feature representation based on their semantic relations.
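To make the preprocessing steps more concrete, the following is a minimal sketch and not the project's method. It assumes an event is just a type plus a time stamp, uses a fixed time gap as a crude stand-in for semantics-based segmentation, and uses end-padding as a crude stand-in for sequence alignment. The names `Event`, `split_by_gap`, and `pad_to_same_length` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass(frozen=True)
class Event:
    """Hypothetical minimal event: a type label and a time stamp."""
    kind: str
    time: datetime


def split_by_gap(events: List[Event], max_gap: timedelta) -> List[List[Event]]:
    """Split a sequence wherever consecutive events are more than max_gap apart.

    Assumption: a time gap approximates a semantic boundary.
    """
    segments: List[List[Event]] = []
    for ev in sorted(events, key=lambda e: e.time):
        if segments and ev.time - segments[-1][-1].time <= max_gap:
            segments[-1].append(ev)   # close enough: same segment
        else:
            segments.append([ev])     # first event or large gap: new segment
    return segments


def pad_to_same_length(seqs: List[List[str]], pad: str = "<pad>") -> List[List[str]]:
    """Bring sequences to a common length by appending a padding symbol."""
    longest = max((len(s) for s in seqs), default=0)
    return [s + [pad] * (longest - len(s)) for s in seqs]


# Illustrative data, not taken from any real record.
record = [
    Event("diagnosis", datetime(2021, 1, 1, 9, 0)),
    Event("lab test", datetime(2021, 1, 1, 10, 0)),
    Event("lab test", datetime(2021, 3, 5, 8, 30)),
]
segments = split_by_gap(record, max_gap=timedelta(days=1))
kinds = [[e.kind for e in seg] for seg in segments]
aligned = pad_to_same_length(kinds)
```

In this toy record the two events on the first day end up in one segment and the later lab test in a second one; padding then gives both segments the same length so that they can be compared position by position.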
Utilizing the resulting feature representations, we will then focus on visual analytics techniques for multiscale sequence summarization, event causality analysis, event prediction, and anomaly detection. We will integrate the data preprocessing, analysis, and visualization techniques into a visual analytics toolkit for event sequence data. Finally, we will apply our techniques to a wide range of important scenarios: novel data analysis methods for EHRs, empirical user studies, and event sequences in software engineering. Overall, we expect novel results for basic research that will advance visual analytics of sequence data as well as practical improvements for the three application examples.
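As a rough illustration of what event prediction can mean in its simplest form, the sketch below counts which event type most often follows another and predicts the most frequent follower. This is a plain first-order transition-count baseline chosen for illustration, not the technique proposed in the project; the function names `fit_transitions` and `predict_next` and the example data are assumptions.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Optional


def fit_transitions(sequences: List[List[str]]) -> Dict[str, Counter]:
    """Count, for every event type, which event types directly follow it."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for seq in sequences:
        for current, following in zip(seq, seq[1:]):
            counts[current][following] += 1
    return counts


def predict_next(counts: Dict[str, Counter], current: str) -> Optional[str]:
    """Return the most frequent follower of `current`, or None if it was never followed."""
    followers = counts.get(current)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Illustrative software-engineering sequences, not real project data.
history = [
    ["edit", "commit", "push"],
    ["edit", "commit", "push"],
    ["edit", "commit", "review"],
]
model = fit_transitions(history)
next_after_commit = predict_next(model, "commit")
next_after_push = predict_next(model, "push")
```

Here "push" follows "commit" twice and "review" once, so the baseline predicts "push" after "commit"; "push" is never followed by anything, so no prediction is made for it. A rarely observed transition in such a count table could likewise serve as a very simple anomaly signal.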
DFG Programme Research Grants
International Connection China
Cooperation Partner Professor Dr. Nan Cao
 
 
