Project Details

Local Dependence in Large Event Data

Subject Area: Statistics and Econometrics
Term: from 2023 to 2024
Project identifier: Deutsche Forschungsgemeinschaft (DFG) - Project number 520770522
 
Final Report Year: 2025

Final Report Abstract

In today’s interconnected world, digital platforms generate an unprecedented amount of network data, capturing social interactions such as messages and emails. These interactions form networks that evolve over time, providing valuable insights into communication patterns, behavior, and structural dependencies. However, traditional statistical models struggle to handle large-scale network data because they assume global dependencies and face computational limitations.

This project tackles these challenges by developing scalable statistical models for large-scale network data. The focus shifted from pure event data to general network data without temporal information, because a foundation of models for large static networks was lacking. This work serves as the basis for future extensions to the temporal domain.

The research introduces methods based on local dependence, which assumes that units in a network are primarily aware of their local neighborhoods rather than the global network. Three main approaches guide the project:

1. Non-Overlapping Neighborhoods: Events are analyzed within distinct, isolated clusters of actors.
2. Domain-Driven Overlapping Neighborhoods: Actor interactions are driven by overlapping social contexts, such as shared affiliations or common partners.
3. Latent Social Spaces: Actor relationships are represented in a hidden space where proximity reflects the likelihood of interaction.

The third approach has not yet been realized. The models developed are theoretically robust and computationally scalable, enabling efficient analysis of large networks. Practical applications range from information diffusion on social media to dependency networks between open-source software packages. At the same time, the project emphasizes reproducibility and accessibility, leading to the release of the software package bigergm for the analysis of big networks. This ensures that researchers and policymakers can apply these tools to real-world problems.

This project bridges the gap between theoretical statistical modeling and practical large-scale event data analysis, offering tools to make sense of complex, dynamic networks in today’s data-driven world.
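To make the idea of non-overlapping neighborhoods concrete, the following is a minimal illustrative sketch in Python. It is not the project's method and does not use the bigergm package. It assumes a deliberately simplified setting in which actors are split into disjoint blocks and ties form independently, with a higher probability inside a block than across blocks. The function names, parameters, and probabilities are hypothetical and chosen only for illustration.

```python
import random
from itertools import combinations


def simulate_local_network(n_actors, n_blocks, p_within, p_between, seed=0):
    """Simulate an undirected network with non-overlapping neighborhoods.

    Assumption (illustration only): ties are independent given the block
    memberships, and are more likely within a block than between blocks.
    """
    rng = random.Random(seed)
    # Assign actors to disjoint neighborhoods in round-robin order.
    block = [i % n_blocks for i in range(n_actors)]
    edges = set()
    for i, j in combinations(range(n_actors), 2):
        p = p_within if block[i] == block[j] else p_between
        if rng.random() < p:
            edges.add((i, j))  # stored with i < j
    return block, edges


def within_between_density(block, edges):
    """Return the share of realized ties within and between neighborhoods."""
    within_pairs = between_pairs = 0
    within_edges = between_edges = 0
    for i, j in combinations(range(len(block)), 2):
        tie = (i, j) in edges
        if block[i] == block[j]:
            within_pairs += 1
            within_edges += tie
        else:
            between_pairs += 1
            between_edges += tie
    return within_edges / within_pairs, between_edges / between_pairs


if __name__ == "__main__":
    block, edges = simulate_local_network(
        n_actors=60, n_blocks=3, p_within=0.5, p_between=0.02, seed=1
    )
    within, between = within_between_density(block, edges)
    print(f"within-neighborhood density:  {within:.3f}")
    print(f"between-neighborhood density: {between:.3f}")
```

In this toy setting most of the structure sits inside the neighborhoods, which is why a model can restrict complex dependence to ties within a neighborhood and treat ties across neighborhoods simply. Models of the kind described above would additionally allow dependence among ties within a neighborhood, which this sketch leaves out.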

Publications

 
 
