Final Report Year
2023
Final Report Abstract
(1) Research Summary
The Japanese team has established several methods for applying signatures, which aggregate path information, to Earth sciences. Because observational data in Earth sciences often take the form of paths, known as profiles, the team demonstrated on real-world data that converting these profiles into signatures enables effective time-series analysis, atmospheric state estimation, quality control of ocean profiles, and data assimilation. While working on these practical applications, the team obtained, jointly with the other teams, insights into the group mean of signatures, the inverse transformation from signatures back to paths, the mean of Brownian motion, and the geodesic distance between signatures.
The French team has investigated applications of the signature method to data stream analysis along several research directions. First, the team has worked on the concept of the mean of signatures, which is an important building block for further applications in machine learning. Second, it has explored clustering methods based on iterated integrals. Several practical applications have been tackled: finance, neuroscience and Earth sciences. On these two topics the French team has interacted with the German team, which contributes its algebraic expertise, and with the Japanese team, which contributes interesting applications.
The German team has focused on algebraic fundamentals: we have, for the first time, defined and studied multi-parameter sums signatures; defined, studied and implemented iterated sums over arbitrary commutative semirings; and laid the theoretical foundation for the use of iterated sums in efficient transformer architectures. Moreover, we have provided the first Python package for using iterated sums in time series classification.
(2) Remarkable Research Achievements
1. Summary Description for 1st result
The signature method has become a powerful tool in the analysis of time series data. Its discrete iterated-sums analogue is particularly useful for the analysis of discrete time series data. Whether similar techniques apply to multi-parameter objects, such as images or videos, has been an open question for some time. In the paper "Two-parameter sums signatures and corresponding quasisymmetric functions" (Joscha Diehl, Leonard Schmitz, under review at Journal of Algebra) we give a positive answer to this question, thereby opening up an entirely new line of research.
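To make the one-parameter iterated-sums idea concrete, here is a minimal sketch for a scalar time series. The function name iterated_sum and its exponents argument are illustrative choices and are not taken from the project's Python package. The sketch computes sums of products of powers of increments over strictly increasing time indices, and checks that repeating values of the series (a discrete time warping) leaves the result unchanged.

```python
import numpy as np

def iterated_sum(x, exponents):
    """Sum over i1 < ... < ik of prod_j dx[i_j] ** exponents[j],
    where dx are the increments of the scalar series x."""
    dx = np.diff(np.asarray(x, dtype=float))
    # acc[t] holds the partial iterated sum that uses only the first t increments.
    acc = np.ones(len(dx) + 1)
    for a in exponents:
        # new[t] = sum over i < t of acc[i] * dx[i] ** a (strictly increasing indices)
        acc = np.concatenate(([0.0], np.cumsum(acc[:-1] * dx ** a)))
    return acc[-1]

series = [0.0, 1.0, 3.0, 2.0]
warped = [0.0, 0.0, 1.0, 3.0, 3.0, 3.0, 2.0]  # same series with repeated values

# Repeated values only add zero increments, so the iterated sums agree.
print(np.isclose(iterated_sum(series, (1, 1)), iterated_sum(warped, (1, 1))))
print(np.isclose(iterated_sum(series, (2, 1)), iterated_sum(warped, (2, 1))))
```

Each pass over the data is a cumulative sum, so the cost is linear in the length of the series for a fixed list of exponents.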
2. Summary Description
The classical signature method owes much of its success to the fact that it can be computed efficiently using Chen's identity. This identity, which essentially says that a two-parameter object (the signature over an interval) can be written in terms of a one-parameter object (the signature up to a time point), has not yet been used to its full potential. In unpublished work, we show that Chen's identity can be used in transformer networks to achieve linear-in-input-size complexity when scanning over time intervals (instead of time points). The theoretical results are sound, and we are currently working on an implementation.
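The role of Chen's identity can be illustrated with a small sketch at truncation level 2 for piecewise-linear paths. This is not the unpublished transformer construction; the helper names (segment, chen, inverse, prefix_signatures, interval_signature) are hypothetical. Prefix signatures are computed once in a single pass, and the signature over any interval is then recovered from two prefix signatures with one product.

```python
import numpy as np

def segment(v):
    """Level-2 signature (levels 1 and 2) of a straight segment with increment v."""
    return v, 0.5 * np.outer(v, v)

def chen(a, b):
    """Chen's identity at level 2: signature of the concatenation of two paths."""
    return a[0] + b[0], a[1] + b[1] + np.outer(a[0], b[0])

def inverse(a):
    """Inverse in the level-2 truncated tensor algebra (signature of the reversed path)."""
    return -a[0], -a[1] + np.outer(a[0], a[0])

def prefix_signatures(path):
    """Signatures over [0, t] for every time index t, computed in one pass."""
    path = np.asarray(path, dtype=float)
    d = path.shape[1]
    out = [(np.zeros(d), np.zeros((d, d)))]
    for v in np.diff(path, axis=0):
        out.append(chen(out[-1], segment(v)))
    return out

def interval_signature(prefix, s, t):
    """Signature over [s, t] from two prefix signatures: S(0,s)^{-1} * S(0,t)."""
    return chen(inverse(prefix[s]), prefix[t])

path = np.random.default_rng(0).normal(size=(8, 3))
prefix = prefix_signatures(path)
direct = prefix_signatures(path[2:7])[-1]   # recomputed from scratch on [2, 6]
via_chen = interval_signature(prefix, 2, 6)  # obtained from the prefixes
print(np.allclose(direct[0], via_chen[0]) and np.allclose(direct[1], via_chen[1]))
```

After the single pass that builds the prefixes, each interval query costs one product of fixed size, independent of the interval length.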
1. Summary Description for 1st result
Signatures have been shown to act as a key component in machine learning for Earth sciences. In this field, time series and observational profiles are treated as multidimensional sequential data. By converting these into signatures, we have demonstrated that high-performance prediction can be achieved either with simple machine learning methods or in combination with other techniques. This effectiveness is based on the mathematical principle that signatures can serve as a basis of functions on a set of paths. Strong evidence for these claims is provided in the paper by Sugiura (2022), as well as in papers currently under review [Fujita et al., 2023; Derot et al., 2023].
2. Summary Description for second result
The calculation of the mean is a key step in many machine learning methods. Several works have addressed the question of how to properly define and compute the mean of several time series. Unfortunately, in most cases either the computation of the average is highly time-consuming or strong assumptions such as stationarity are required. In addition, to the best of our knowledge, few methods address the multivariate setting. The aim of our work was to propose a signature-based approach to averaging multidimensional time series. Our contribution was to take advantage of the Lie group structure to average signatures, using the exp and log operations of the group [Mignot et al. 2022].
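The idea of averaging through the group's log and exp maps can be sketched at truncation level 2, where both maps have simple closed forms. This is an illustration under that assumption only, not the algorithm of the cited work, which covers general truncation levels; the names sig2, log2, exp2 and log_exp_mean are hypothetical.

```python
import numpy as np

def sig2(path):
    """Level-2 truncated signature (levels 1 and 2) of a piecewise-linear path."""
    path = np.asarray(path, dtype=float)
    d = path.shape[1]
    s1, s2 = np.zeros(d), np.zeros((d, d))
    for v in np.diff(path, axis=0):
        # Append one linear segment with increment v (Chen's identity).
        s2 = s2 + np.outer(s1, v) + 0.5 * np.outer(v, v)
        s1 = s1 + v
    return s1, s2

def log2(s):
    """Group logarithm at level 2: removes the symmetric part of the second level."""
    a, A = s
    return a, A - 0.5 * np.outer(a, a)

def exp2(u):
    """Group exponential at level 2, inverse of log2."""
    x, X = u
    return x, X + 0.5 * np.outer(x, x)

def log_exp_mean(sigs):
    """Average the log-signatures in the Lie algebra, then map back to the group."""
    logs = [log2(s) for s in sigs]
    m1 = np.mean([l[0] for l in logs], axis=0)
    m2 = np.mean([l[1] for l in logs], axis=0)
    return exp2((m1, m2))

rng = np.random.default_rng(0)
paths = [rng.normal(size=(20, 2)).cumsum(axis=0) for _ in range(5)]
mean_sig = log_exp_mean([sig2(p) for p in paths])

# The result is again group-like: its logarithm has an antisymmetric second level.
L = log2(mean_sig)[1]
print(np.allclose(L, -L.T))
```

Averaging the second-level matrices directly would in general leave the group, which is why the sketch passes through log and exp instead.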
1. Summary Description
The barycenter in free nilpotent Lie groups and its application to iterated-integrals signatures (J. Diehl, M. Clausel, R. Mignot, L. Schmitz, N. Sugiura and K. Usevich, arXiv:2305.18996, 2023, under review at SIAM Journal on Applied Algebra and Geometry). Here we established general results on the so-called signature mean, which we had previously defined. We first establish the existence and uniqueness of the group mean and then provide two algorithms, together with an implementation in SageMath and Python, for calculating the empirical mean of a collection of time series. Our code is publicly available at https://github.com/Raph-AI/signaturemean
2. Summary Description
Two-parameter sums signatures and corresponding quasisymmetric functions (J. Diehl, L. Schmitz, https://arxiv.org/abs/2210.14247, under review at Journal of Algebra). Here we move beyond time series analysis by introducing a signature for two-parameter data, which extends the applicability of the method to image analysis. The resulting features are polynomial and invariant under time warping. We show that these two-parameter features are mathematically complete in a specific sense, and we set up the proper Hopf algebraic framework, including a weak form of Chen's identity.
3. Summary Description
The moving frame method for iterated-integrals: orthogonal invariants (Joscha Diehl, Rosa Preiß, Michael Ruddy and Nikolas Tapia, Foundations of Computational Mathematics, 2022). Invariant elements of the signature are the main source of interpretable features; the ‘area’, which is invariant under rotations, is a prime example. In this work we apply Fels–Olver’s moving-frame method (for geometric features) paired with the log-signature transform (for robust features) to construct a set of integral invariants under rigid motions for curves in R^d from the iterated-integrals signature.
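The rotation invariance of the ‘area’ can be checked numerically for a planar piecewise-linear path. The sketch below only illustrates this one invariant and does not implement the moving-frame construction; the function name levy_area is a hypothetical choice.

```python
import numpy as np

def levy_area(path):
    """Signed area 1/2 * integral of (x1 dx2 - x2 dx1) for a planar
    piecewise-linear path, measured relative to its starting point."""
    path = np.asarray(path, dtype=float)
    p = (path - path[0])[:-1]      # start point of each segment
    v = np.diff(path, axis=0)      # increment of each segment
    # On a linear segment the integrand reduces to the cross product p x v.
    return 0.5 * np.sum(p[:, 0] * v[:, 1] - p[:, 1] * v[:, 0])

def rotate(path, theta):
    """Rotate a planar path by the angle theta around the origin."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    return np.asarray(path, dtype=float) @ R.T

path = np.random.default_rng(0).normal(size=(30, 2)).cumsum(axis=0)
print(np.isclose(levy_area(path), levy_area(rotate(path, 0.7))))       # rotation: unchanged
print(np.isclose(levy_area(path), -levy_area(path * [1.0, -1.0])))     # reflection: sign flips
```

The area is the antisymmetric part of the second signature level, so it is also unchanged by translations of the path.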
Publications
-
Two-parameter sums signatures and corresponding quasisymmetric functions
J. Diehl & L. Schmitz
-
The Moving-Frame Method for the Iterated-Integrals Signature: Orthogonal Invariants. Foundations of Computational Mathematics, 23(4), 1273-1333.
Diehl, Joscha; Preiß, Rosa; Ruddy, Michael & Tapia, Nikolas
-
The Barycenter in Free Nilpotent Lie Groups and Its Application to Iterated-Integrals Signatures. SIAM Journal on Applied Algebra and Geometry, 8(3), 519-552.
Clausel, Marianne; Diehl, Joscha; Mignot, Raphael; Schmitz, Leonard; Sugiura, Nozomi & Usevich, Konstantin