The Epistemology of Statistical Learning Theory
Theoretical Computer Science
Final Report Abstract
Methods in machine learning have an ever-increasing impact on science and society. This calls for a better understanding not only of the (societal, ethical, and other) consequences of the use of machine learning methods in various domains, but also of these methods themselves. What explains their apparent success? In what sense and to what extent do they lead to reliable conclusions? What are their unavoidable pitfalls and limitations? These questions concern the nature and justification of uncertain or inductive inference: fundamental epistemological questions. Yet philosophers have engaged little with these questions in relation to the most prominent modern-day methods of inductive inference, algorithms in machine learning. In particular, formal philosophers have taken little advantage of existing mathematical frameworks for the analysis of learning algorithms. The standard such framework is statistical learning theory (SLT), and the main idea of this project was to employ the framework of SLT for an epistemological study of modern machine learning methods.

The first part of the project focused on the fundamental limitations of inductive inference, and specifically on the so-called no-free-lunch theorems in machine learning. The main question here was how exactly these skeptical results can coexist with the positive theoretical guarantees that SLT offers. The project answers this question by spelling out how the general learning guarantees for standard algorithms are nevertheless relative to the restrictive model assumptions, or inductive bias, with which we need to equip the algorithm in each application. The resulting notion of model-relative justification opens up a middle ground between (unfruitfully) resorting to skepticism and (implausibly) postulating some metaphysical principle of learning-friendliness of the world. Moreover, extending an important pragmatist strand in the modern philosophy of science, the project argues that this notion of justification assigns learning theory a natural normative role in a wider “forward-looking” epistemological account of inquiry with machine learning methods.

The second part of the project concerned Occam’s razor, the philosophical principle that a preference for simplicity is conducive to good inductive reasoning. Here the question was whether SLT can offer some theoretical justification for this principle. Building on the work of the first part, the project spells out how SLT offers a qualified model-relative means-ends justification for Occam’s razor.

The project has thus significantly advanced our epistemological understanding of SLT. In the meantime, much research in computer science has focused on the limitations of SLT in accounting for the success of modern algorithms such as deep neural networks. The project’s results also provide a natural basis for a philosophical appraisal of this important search for a new theory.
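To make the kind of model-relative guarantee at issue concrete, the following is a standard textbook illustration; the specific form of these bounds is an expository assumption and is not drawn from the project’s publications. For a finite hypothesis class $\mathcal{H}$ and an i.i.d. sample $S$ of size $m$, with probability at least $1-\delta$ every $h \in \mathcal{H}$ satisfies

\[
  L_D(h) \;\le\; L_S(h) \;+\; \sqrt{\frac{\ln|\mathcal{H}| + \ln(2/\delta)}{2m}},
\]

where $L_D$ denotes the true risk and $L_S$ the empirical risk on $S$. The guarantee is relative to the model $\mathcal{H}$: it is precisely the inductive bias of restricting attention to $\mathcal{H}$ that makes the bound possible, consistently with the no-free-lunch theorems. A simple “Occam bound” refines this by weighting each hypothesis by its description length $|h|$ in bits under a prefix-free code: with probability at least $1-\delta$, every $h$ satisfies

\[
  L_D(h) \;\le\; L_S(h) \;+\; \sqrt{\frac{|h|\ln 2 + \ln(2/\delta)}{2m}},
\]

so that simpler (shorter-description) hypotheses receive tighter guarantees, illustrating the sense in which a means-ends, model-relative justification of Occam’s razor can be read off from SLT.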
Publications
- Sterkenburg, Tom F. & Grünwald, Peter D. The no-free-lunch theorems of supervised learning. Synthese, 199(3-4), 9979-10015.
- Sterkenburg, Tom F. On characterizations of learnability with computable learners. Proceedings of Machine Learning Research 178 (Conference on Learning Theory, COLT 2022), 3365-3379.
- Stewart, Rush T. & Sterkenburg, Tom F. Peirce, Pedigree, Probability. Transactions of the Charles S. Peirce Society, 58(2).
- Sterkenburg, Tom F. Review of John D. Norton, The Material Theory of Induction (Calgary: University of Calgary Press, 680 pp.). Philosophy of Science, 91(4), 1030-1033.
- Sterkenburg, Tom F. On Explaining the Success of Induction. The British Journal for the Philosophy of Science, 76(1), 75-93.
