Project Details

KIND-LM: Cognitively-inspired interaction dynamics for sample-efficient language modeling

Subject Area Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Developmental and Educational Psychology
Term since 2026
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 579171395
 
Computational models of language can generate remarkably fluent text, but their impressive performance comes at the cost of training on trillions of tokens with unsustainable computational resources. When trained under academic resource constraints, such models fall short of robust linguistic generalization and often fail to adapt to unseen contexts. Human learners, by contrast, acquire language from vastly smaller input and can flexibly adapt to new communicative situations from an early age. A central difference lies in the learning signal: while human acquisition is embedded in rich social interactions, language models are typically optimized for the narrow task of next-word prediction.
This project develops a cognitively grounded approach to interactive language modeling that integrates feedback mechanisms inspired by child–caregiver communication. We propose a training setup in which a child model improves its linguistic competence through interaction with a more powerful parent model. Unlike existing teacher–student approaches, which assume unilateral feedback, we focus on the temporal and linguistic interaction dynamics and on the interaction initiative. We build on our winning submission to the new interaction track of the BabyLM Challenge, which used a reinforcement loop and showed that even simplified feedback strategies can enhance functional linguistic competence without sacrificing formal accuracy.
We propose to better align computational modeling with psycholinguistic evidence and systematically test cognitively more plausible interaction strategies. We will draw on mechanistic interpretability methods to better understand how interaction dynamics influence the representational structure of the model and how they can improve its ability to generalize to the long tail of the vocabulary distribution.
Our project advances research on cognitively inspired sample-efficient modeling and contributes to the Priority Programme LaSTing by using language technology as a simulation framework to deepen our understanding of human language learning.
DFG Programme Priority Programmes
International Connection Netherlands
Cooperation Partner Professor Arianna Bisazza
 
 

