Project Details
Distributional Regression with Artificial Neural Networks
Applicant
Dr. Benjamin Säfken
Subject Area
Statistics and Econometrics
Term
since 2020
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 450330162
Data-driven digitization is making large quantities of unstructured and heterogeneous data, such as images and texts, increasingly available. Until now, the potential of such data has been limited, especially for empirical analysis with conventional statistical methods such as regression models. Machine learning, and deep learning in particular, has made significant progress in the predictive modelling of these data types. In contrast to advanced methods of distributional regression, however, these methods often focus on prediction rather than on uncertainty modelling. In addition, the complex structure of the models leads to a lack of explainability.
The aim of this project is to use the neural networks underlying deep learning for inference in structured additive distributional regression models. Synthesizing the two model classes is intended to enable an explicit probability-based quantification of the dependent variable. In contrast to the frequent focus on predicting the mean, this allows the full distribution of the dependent variable to be modelled. By including extended distributional regression models, such as models for bivariate dependent variables based on copulas, this approach allows a more complex structure for the dependent variable while retaining the complex structure of the explanatory variables that is characteristic of deep neural networks. Effect-specific, interpretable parameters also make it possible to integrate a human-understandable component into the deep neural network. In addition, the quality of the specified model can be assessed using inference methods and (asymptotic) distributional assumptions on the estimated parameters.
From the perspective of structured additive distributional regression, the use of high-performance algorithms for optimizing deep neural networks also enables such models to be estimated for larger data sets than current implementations allow. Furthermore, because neural networks can represent arbitrarily complex functions, the predefined additive structure of the predictor in structured additive distributional regression can be dropped, so that more complex functional relationships and interactions can be represented in the predictor.
The targeted research findings thus provide a promising starting point for future research on using artificial intelligence for statistical analysis and can contribute to the comprehensibility and explainability of deep neural networks. At the same time, this project opens up possibilities for analysing more complex data types, such as images, in empirical research.
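The difference between predicting only the mean and modelling the whole distribution of the dependent variable can be illustrated with a minimal sketch. It is not the project's method: it assumes a Gaussian response, and plain linear predictors stand in for the neural network. The function names (`predict_params`, `gaussian_nll`, `interval`), the coefficients, and the four-point toy data set are all hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist


def predict_params(x, beta_mu, beta_log_sigma):
    """Map a covariate to both Gaussian parameters, one predictor each."""
    mu = beta_mu[0] + beta_mu[1] * x
    # The log link keeps the standard deviation strictly positive.
    sigma = math.exp(beta_log_sigma[0] + beta_log_sigma[1] * x)
    return mu, sigma


def gaussian_nll(xs, ys, beta_mu, beta_log_sigma):
    """Negative Gaussian log-likelihood summed over all observations."""
    total = 0.0
    for x, y in zip(xs, ys):
        mu, sigma = predict_params(x, beta_mu, beta_log_sigma)
        total += (0.5 * math.log(2 * math.pi)
                  + math.log(sigma)
                  + 0.5 * ((y - mu) / sigma) ** 2)
    return total


def interval(x, beta_mu, beta_log_sigma, level=0.95):
    """Central prediction interval implied by the fitted distribution."""
    mu, sigma = predict_params(x, beta_mu, beta_log_sigma)
    dist = NormalDist(mu, sigma)
    tail = (1 - level) / 2
    return dist.inv_cdf(tail), dist.inv_cdf(1 - tail)


# Toy data (hypothetical): the spread of y grows with x, the mean stays at 0.
xs = [0.0, 0.0, 1.0, 1.0]
ys = [-0.1, 0.1, -1.0, 1.0]

# Mean-only view: one constant standard deviation for all observations
# (0.505 is the average squared residual of the toy data).
constant_scale = (0.5 * math.log(0.505), 0.0)
# Distributional view: sigma = 0.1 at x = 0 and sigma = 1.0 at x = 1.
varying_scale = (math.log(0.1), math.log(10.0))
mean_coefficients = (0.0, 0.0)

nll_constant = gaussian_nll(xs, ys, mean_coefficients, constant_scale)
nll_distributional = gaussian_nll(xs, ys, mean_coefficients, varying_scale)

# The covariate-dependent scale fits the toy data better (lower NLL).
print(nll_constant, nll_distributional)
# The prediction interval widens with x, which a mean-only model cannot show.
print(interval(0.0, mean_coefficients, varying_scale))
print(interval(1.0, mean_coefficients, varying_scale))
```

In the setting described above, the two linear predictors would be replaced by structured additive terms or by outputs of a deep neural network, and the coefficients would be estimated by minimizing the negative log-likelihood rather than fixed by hand.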
DFG Programme
Research Grants