Project Details
Relating Probabilities of Words to Probabilities of Worlds
Applicant
Dr. Sean Papay
Subject Area
Methods in Artificial Intelligence and Machine Learning
Applied Linguistics, Computational Linguistics
Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Term
since 2026
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 579381360
Large language models (LLMs) generate text by defining and sampling from a probability distribution over strings. In modeling this distribution, they acquire not only linguistic knowledge but also world knowledge, which benefits both autoregressive next-token prediction and the downstream tasks to which language models are applied. Although this world knowledge is vital to LLMs’ performance, we cannot observe it directly; we can only infer its properties from generated strings and their probabilities. In this project, we propose interpreting this world knowledge as a latent distribution over semantic world states that underlies the string distribution, and investigating the properties of this world distribution. Concretely, this will involve using natural-language descriptions to probe the probabilities that models assign to propositions conditioned on premises.

Such an investigation will serve three major purposes: (1) to better explain models’ behavior in terms of world beliefs, (2) to improve downstream applications of LLMs by decoupling semantic beliefs from surface realizations, and (3) to develop general-purpose probability estimation models for use in cognitive modeling.

Over the course of this project, we will address five major research questions: (1) How can we extract semantic probabilities from LLMs? (2) How do the extracted probabilities correspond to empirical probabilities? (3) Are extracted probabilities consistent with one another? (4) How do extracted probabilities relate to human judgments? (5) Can we reconstruct consistent belief states by augmenting LLMs with additional structure? We will answer these questions empirically, through experiments with existing LLMs and a human annotation project to elicit probability judgments. This work will provide a better frame of reference for explaining LLM behavior, tools for directly extracting semantic beliefs for downstream tasks, and general-purpose probabilistic world models for use in cognitive modeling.
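As an illustration of the third research question, one way to read "consistent with one another" is that extracted probabilities should obey the basic laws of probability. The sketch below is only an assumption about what such a check could look like, not the project's method: the function name `coherence_violations`, its arguments, and the tolerance are hypothetical, and the inputs stand in for probabilities that would be extracted from a model for two propositions A and B.

```python
def coherence_violations(p_a, p_not_a, p_b, p_a_and_b, tol=1e-6):
    """Return the names of probability laws violated by a set of estimates.

    All arguments are hypothetical extracted probabilities for two
    propositions A and B; this is an illustrative check, not the
    project's actual procedure.
    """
    violations = []

    # Complement rule: P(A) + P(not A) must equal 1.
    if abs(p_a + p_not_a - 1.0) > tol:
        violations.append("complement")

    # Conjunction rule: P(A and B) cannot exceed P(A) or P(B).
    if p_a_and_b > min(p_a, p_b) + tol:
        violations.append("conjunction")

    # Frechet lower bound: P(A and B) >= P(A) + P(B) - 1.
    if p_a_and_b < max(0.0, p_a + p_b - 1.0) - tol:
        violations.append("frechet_lower_bound")

    return violations


if __name__ == "__main__":
    # A coherent set of estimates yields no violations.
    print(coherence_violations(0.7, 0.3, 0.5, 0.4))
    # Here the complement and conjunction rules are both broken.
    print(coherence_violations(0.7, 0.5, 0.5, 0.6))
```

A check of this kind only tests pairwise relations between a handful of estimates; a fuller notion of a consistent belief state would have to consider many propositions jointly.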
DFG Programme
Priority Programmes
Co-Investigator
Professor Dr. Roman Klinger
