Random Matrix Approaches to Approximate Bayesian Inference in Machine Learning
Electronic Semiconductors, Components and Circuits, Integrated Systems, Sensor Technology, Theoretical Electrical Engineering
Statistical Physics, Nonlinear Dynamics, Complex Systems, Soft and Fluid Matter, Biological Physics
Final Report Abstract
A common task in fields such as machine learning, information theory and signal processing is to learn from large datasets in order to make efficient and accurate predictions on novel data. Methods to achieve these goals often rely on Bayesian statistical models, which combine prior knowledge and observed data into probability distributions that allow both the computation of likely predictions and an assessment of the uncertainty associated with them. In practice, however, exact computations with large Bayesian models become intractable, and one has to resort to methods of approximate inference. Such methods often express approximations to the desired predictions as solutions of large sets of coupled nonlinear equations. The development of fast and reliable numerical algorithms for solving these equations, together with the theoretical study of their performance and of the quality of the approximations, is an active and important topic of research.

In this project we have focussed on algorithms of the AMP (approximate message passing) type in the important limit where the number of 'nodes' in the algorithm is very large. In this limit, global properties become independent of individual data instances, and exact results are possible under specific statistical assumptions on the randomness of the data matrices. Using techniques of statistical physics, we have derived equations for the stochastic behaviour of each single node in the algorithm. Unfortunately, for general algorithms even the single-node dynamics can be too complicated to yield useful results, because of effective interactions between a node variable at different time steps. Surprisingly, we have found that such 'memory' interactions disappear for a variety of AMP-style algorithms, and we have been able to relate this property to results of random matrix theory.
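To illustrate the kind of iteration discussed above, the following is a minimal sketch of a textbook AMP recursion for a sparse linear model y = A x + noise with a soft-threshold denoiser. It is an illustrative example only, not the specific algorithms analysed in the project; the names (A, y, lam) and the denoiser choice are assumptions made for the sketch. The Onsager reaction term in the residual update is the correction that cancels the 'memory' interactions, so that in the large-system limit with an i.i.d. Gaussian matrix the effective single-node dynamics remain simple.

```python
import numpy as np

def soft_threshold(v, t):
    """Componentwise soft-thresholding denoiser."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def amp(A, y, lam=0.1, n_iter=30):
    """Textbook AMP iteration for y = A x + noise (illustrative sketch)."""
    m, n = A.shape
    x = np.zeros(n)   # current estimate of the signal
    z = y.copy()      # corrected residual
    for _ in range(n_iter):
        v = x + A.T @ z                  # effective (noisy) observation per node
        x = soft_threshold(v, lam)       # componentwise denoising step
        # Onsager correction: the extra term proportional to the previous
        # residual cancels the 'memory' interactions between time steps.
        z = y - A @ x + z * np.count_nonzero(x) / m
    return x
```

Without the Onsager term this would be plain iterative thresholding, whose single-node dynamics do retain memory across time steps; the correction is what makes the large-system analysis tractable.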
We have thus obtained exact analytical results for the stability and speed of convergence of AMP-style algorithms for known models, and we have also developed new AMP algorithms. We have shown that the statistical assumptions on the randomness of the matrices can be weakened, suggesting relevance to more realistic scenarios. Although our approach relied on methods of statistical physics, our new results have inspired further research in the information theory and applied mathematics communities, leading to a variety of rigorous proofs. In a collaboration with Professor Yue M. Lu (Harvard University), we have also developed a novel approach to such a rigorous analysis of algorithms, which we have applied to a machine learning classification model with multiple outputs. Dr. Burak Çakmak, who worked on the project as a postdoc, currently continues this line of research, which he now applies to practically relevant and complex algorithms in the field of telecommunications.
Publications
- Çakmak, Burak & Opper, Manfred: A dynamical mean-field theory for learning in restricted Boltzmann machines. Journal of Statistical Mechanics: Theory and Experiment, 2020(10), 103303.
- Opper, Manfred & Çakmak, Burak: Understanding the Dynamics of Message Passing Algorithms: A Free Probability Heuristics. Acta Physica Polonica B, 51(7), 1673.
- Çakmak, Burak & Opper, Manfred: Exact solution to the random sequential dynamics of a message passing algorithm. Physical Review E, 103(3).
- Çakmak, Burak & Opper, Manfred: Analysis of Bayesian inference algorithms by the dynamical functional approach. Journal of Physics A: Mathematical and Theoretical, 53(27), 274001.
- Çakmak, Burak; Lu, Yue M. & Opper, Manfred: Analysis of random sequential message passing algorithms for approximate inference. Journal of Statistical Mechanics: Theory and Experiment, 2022(7), 073401.
- Çakmak, Burak; Lu, Yue M. & Opper, Manfred: A Convergence Analysis of Approximate Message Passing with Non-Separable Functions and Applications to Multi-Class Classification. 2024 IEEE International Symposium on Information Theory (ISIT), 747-752. IEEE.
