S3-GEP: Scalable Spatiotemporal Statistics for Global Environmental Phenomena
Final Report Abstract
The large volume of Earth observation data available today allows for complex analysis of environmental phenomena on a global scale. Satellite-based observations yield continuous global time series of the Earth’s surface and atmosphere. However, applying complex statistical analyses to these data quickly runs into methodological and technical difficulties. First, datasets are typically large in volume and have a complex structure (overlapping images, different pixel sizes and spatial reference systems, etc.). Second, common geostatistical models are computationally expensive and make strict assumptions about variables that do not hold on a global scale. As a result, researchers must invest a large amount of work in preprocessing, and the potential of the data is not fully utilized.
The project “Scalable Spatiotemporal Statistics for Global Environmental Phenomena” studied new ways to facilitate and scale up the use of large multitemporal Earth observation data in statistical modeling. To facilitate the management of large satellite image time series, a new data representation as on-demand data cubes was developed. Such data cubes allow for a scalable and more interactive use of large datasets while providing the flexibility to vary the spatiotemporal resolution and/or the study area. To make geostatistical analysis of such datasets possible on a global scale, spatiotemporal multiresolution approximations were developed. They allow computation time to be balanced against prediction accuracy and are applicable in distributed computing environments. In a study on simulated data, we were able to speed up typical interpolation tasks considerably (speedup factor > 100) while increasing prediction errors (RMSE) by only 6.6% compared to traditional Kriging. To relax model assumptions on a global scale, we integrated a kernel-convolution approach with spatially and/or temporally varying parameters of the covariance function.
The project also took up the rapid development of deep learning methods in recent years. We studied how artificial neural network models based on partial convolutions can be used to fill gaps in atmospheric measurements. The results showed promising computational properties and efficiency, and also that such models are well suited for dynamic modeling in nowcasting applications.
Methodological results have been published in articles and as open-source software. To demonstrate practical applications, case studies on different environmental variables (e.g. sea surface temperature, atmospheric carbon monoxide, land surface temperature) were performed.
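To illustrate the gap-filling idea mentioned above, the following is a minimal sketch of a single partial convolution step. It is not the project's model: it assumes the commonly used formulation in which only observed pixels contribute to the weighted sum, the result is rescaled by the share of observed pixels in the window, and the validity mask is updated. For brevity it works on one 2D image in pure Python rather than on three-dimensional image time series; the function name `partial_conv2d` and all parameters are hypothetical.

```python
def partial_conv2d(image, mask, kernel, bias=0.0):
    """One 'valid'-mode 2D partial convolution step (illustrative sketch).

    image  : list of rows with pixel values (values under gaps are ignored)
    mask   : list of rows, 1 = observed pixel, 0 = gap
    kernel : list of rows with filter weights
    Returns (output, new_mask), both of the reduced 'valid' size.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    window_size = kh * kw

    output, new_mask = [], []
    for i in range(out_h):
        out_row, mask_row = [], []
        for j in range(out_w):
            weighted_sum = 0.0
            n_valid = 0
            for di in range(kh):
                for dj in range(kw):
                    if mask[i + di][j + dj]:
                        # Only observed pixels enter the weighted sum.
                        weighted_sum += kernel[di][dj] * image[i + di][j + dj]
                        n_valid += 1
            if n_valid > 0:
                # Rescale to compensate for the pixels that were masked out.
                out_row.append(weighted_sum * window_size / n_valid + bias)
                mask_row.append(1)  # window saw at least one observation
            else:
                out_row.append(0.0)
                mask_row.append(0)  # still a gap after this step
        output.append(out_row)
        new_mask.append(mask_row)
    return output, new_mask


if __name__ == "__main__":
    # Constant field with one missing pixel in the centre, mean filter.
    img = [[2.0, 2.0, 2.0], [2.0, 0.0, 2.0], [2.0, 2.0, 2.0]]
    msk = [[1, 1, 1], [1, 0, 1], [1, 1, 1]]
    mean_kernel = [[1.0 / 9.0] * 3 for _ in range(3)]
    filled, updated_mask = partial_conv2d(img, msk, mean_kernel)
    print(filled, updated_mask)
```

Because of the rescaling, the gap in the example does not pull the filtered value towards zero, as an ordinary convolution over zero-filled gaps would. Stacking several such steps shrinks the gaps progressively, since every window that contains at least one observation becomes valid in the updated mask.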
Publications
-
On-Demand Processing of Data Cubes from Satellite Image Collections with the gdalcubes Library. Data, 4(3), 92.
Appel, Marius & Pebesma, Edzer
-
Processing Large Satellite Image Collections as Data Cubes with the gdalcubes R package. OpenGeoHub Summer School, Münster, Germany, Aug 02-06
Appel, Marius
-
Analyzing Multi-Variable Earth Observation Data Cubes. Geospatial Sensing | Virtual 2020, Aug 31 - Sep 2
Appel, Marius
-
Creating and Analyzing Multi-Variable Earth Observation Data Cubes in R. OpenGeoHub Summer School, Wageningen, The Netherlands, Aug 17-21
Appel, Marius
-
Spatiotemporal multi-resolution approximations for analyzing global environmental data. Spatial Statistics, 38, 100465.
Appel, Marius & Pebesma, Edzer
-
Analyzing massive amounts of EO data in the cloud with R, gdalcubes, and STAC. OpenGeoHub Summer School, online, Sept 1-3, 2021.
Appel, Marius
-
Implementation of geostatistical models for large spatiotemporal datasets using multi-resolution approximations. Copernicus GmbH.
Appel, Marius & Pebesma, Edzer
-
Multi-resolution approximations to map global phenomena from massive datasets. Symposium: Statistical approaches for analyzing Remote Sensing imagery, online, Jul 23
Appel, Marius
-
Training and validation data for artificial neural networks using three-dimensional partial convolutions to fill gaps in satellite image time series (v0.1.0)
Appel, Marius
-
Efficient Data-Driven Gap Filling of Satellite Image Time Series Using Deep Neural Networks with Partial Convolutions. Artificial Intelligence for the Earth Systems, 3(2).
Appel, Marius
