Event-centric Exploration and Analysis of Linked Open Data

Applicants Professor Dr. Michael Gertz; Professor Dr.-Ing. Kai-Uwe Sattler
Subject Area Security and Dependability, Operating-, Communication- and Distributed Systems
Term from 2014 to 2018
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 249438185
 

Project Description

Time and geographic information has become ubiquitous, ranging from geotagged tweets to elaborate descriptions of phenomena in textual documents. Time and geographic information in particular provides the basis for events, that is, descriptions of something that happens at a particular point in time and at a particular location, including the entities involved, such as persons and organizations. The need for event-based exploration of data, whether represented as linked open data (LOD) or held in traditional document repositories, is becoming increasingly urgent in many disciplines, such as the humanities, history, and medicine. In this project, we aim to develop a novel and comprehensive framework that supports diverse search and exploration tasks over event data extracted from heterogeneous resources. As a schema for event descriptions, we will employ and combine existing event ontologies, focusing on the description of time and geographic information and exploiting the well-defined semantics of time and space. The event schema serves as the basis for the different extraction and integration tasks that will be developed in the course of the project. Sources include existing LOD repositories, such as YAGO2 and DBpedia; event-related sources, such as the website eventful.com; and traditional textual sources, such as Wikipedia. In our event extraction approaches, we place particular emphasis on text documents, from which event descriptions are extracted using event-centric information extraction techniques. The extraction and integration of event data involve different mapping and normalization techniques to provide a consistent and uniform basis for the different event search and exploration tasks. A core novelty of the proposed approach is the development, implementation, and evaluation of event-correlation operators. The operators, targeted at the efficient processing of event data represented as RDF, aim to detect correlations between events.
The temporal and spatial properties of events form the basis for correlations and thus for the different event similarity measures. Supported by a comprehensive and partially automated event extraction, integration, and processing pipeline, this project will (1) design, implement, and populate a comprehensive event repository that is interlinked with other LOD sources and (2) provide users with rich event search and exploration functionality. This functionality includes searching for events similar to a given event, deriving event trajectories for persons, and correlating events based on temporal, spatial, and contextual information. The event repository and the processing pipeline, including the event-correlation operators, will be made available to the research community and the public.
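To make the notions of event similarity and event trajectories more concrete, the following minimal sketch shows one way a similarity measure over temporal and spatial properties could look. It is an illustration only and not the project's method: the Event record, the function names, the exponential decay, and the scale parameters (30 days, 100 km) are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, exp, radians, sin, sqrt


@dataclass(frozen=True)
class Event:
    """Hypothetical event record: something that happens at a time and place."""
    label: str
    day: date
    lat: float  # latitude in decimal degrees
    lon: float  # longitude in decimal degrees
    participants: frozenset = frozenset()  # persons or organizations involved


def haversine_km(a: Event, b: Event) -> float:
    """Great-circle distance between two event locations in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (a.lat, a.lon, b.lat, b.lon))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def similarity(a: Event, b: Event, day_scale: float = 30.0, km_scale: float = 100.0) -> float:
    """Score between 0 and 1; 1.0 means same day and same place.

    The exponential decay and both scale values are illustrative assumptions.
    """
    days = abs((a.day - b.day).days)
    return exp(-days / day_scale) * exp(-haversine_km(a, b) / km_scale)


def trajectory(events, person):
    """Events involving `person`, ordered by date (a simple event trajectory)."""
    return sorted((e for e in events if person in e.participants), key=lambda e: e.day)
```

Calling similarity(e1, e2) on two such records yields a higher score the closer the events are in time and space, and trajectory(events, "some person") returns that person's events in chronological order. A contextual component, as mentioned above, would need an additional term that this sketch leaves out.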
DFG Programme Research Grants