Project Details

Propositional Attitudes in Large Language Models (PALLM)

Subject Area: General and Comparative Linguistics, Experimental Linguistics, Typology, Non-European Languages
Term: since 2026
Project identifier: Deutsche Forschungsgemeinschaft (DFG) - Project number 579297085
 
Decades of interdisciplinary research on the meanings of propositional attitude verbs like “believe” and “want” have furnished deep insights into how humans use language to describe mental states. This project will be the first to study how large language models (LLMs) use such verbs, comparing LLMs both to humans and to each other. Understanding LLM use of attitude verbs has the potential to inform future research in both AI and linguistics.

From an AI perspective, attitudes have quietly played a crucial role in several areas of LLM research, such as work on LLMs’ purported theory of mind capabilities and on their use for intent classification. Moreover, many LLM prompts used in leading AI products make critical use of attitude verbs to govern the behavior of AI agents, e.g., by prompting them to act based on their own beliefs or on the perceived desires of their human interlocutors. Thus, understanding how LLMs interpret attitudes is also potentially important to AI safety and alignment.

On the linguistics side, seeing where LLMs succeed and fail in interpreting attitudes can give us important information about what is necessary for language acquisition. Attitudes are especially interesting because of the unique acquisition challenge they pose: “believe” and “want” do not have the same obvious physical correlates as, e.g., “clap”, so humans are particularly reliant on distributional information to glean their interpretations. This reliance bears a partial resemblance to the statistical learning methods used to train LLMs. If LLMs nevertheless fail to emulate human language use, this suggests that something more than sheer volume of training data is required. Comparing across LLMs is valuable here as well, since correlating differences in performance with differences in model size, architecture, and training can provide further insights.

On an empirical level, the project will focus on three topics pertaining to propositional attitudes. The first will be entailments in their complements: when does “X wants J” logically entail “X wants K”, and likewise for “believe”, etc.? The second will be attitude measurement constructions: what does it mean if “X wants J more than Y wants K”? The third will be cases in which the truth conditions of an attitude ascription are affected by the syntactic or pragmatic environment in which it occurs (so-called “restricted readings”).

For each empirical domain, the project will combine theoretical linguistic research, human psycholinguistic research (for those phenomena where the empirical picture is not yet settled), and AI research. In addition to this purely scientific work, the project will also include the development of new open-source software for linguistic research on LLMs, as well as the creation of a new benchmark dataset to evaluate current and future LLMs.
DFG Programme: Priority Programmes
 
 
