Project Details

Limits and Biases in Machine and Human Language and its Learning

Subject Area General and Comparative Linguistics, Experimental Linguistics, Typology, Non-European Languages
Term since 2026
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 579375133
 
Despite appearing human-like in some tasks, language models (LMs) differ substantially from humans in their structure and in the structure of their training. The LIBILLE project aims to improve our understanding of these structural differences and their effects. To do so, it formulates predictions about differences and similarities in human and LM behaviour, focussing on the morphosyntactic domain, and then attempts to validate them. The project explores both the mature state, i.e. adult humans vs. trained LMs, and the learning/training stages of humans and LMs. Furthermore, it applies structural probing to ensure that the explanations of LM behaviour are the predicted ones.
Central features of LM design are the tokenizer, embeddings, the transformer architecture, and gradient descent learning. We argue that this architecture predicts a number of differences in morphosyntactic behaviour in the mature state. The three we will focus on concern morphological paradigm gaps, morphological generalization to invented ('nonce') words, and sensitivity to a language's writing conventions, for example with definite markers and clitic pronouns. The project will devise novel experiments to test LMs and humans for such predicted differences.
The LM architecture also predicts that, during training, LMs should exhibit different properties from human children acquiring language. We develop different paradigms to test for the predicted learning differences using in-context learning, artificial grammar learning, and other techniques. The three predictions we focus on are the absence in LMs of the typical stages of human language acquisition, the ability of LMs to learn generalizations that are impossible in human languages, and an improvement in LM learning when training induces human-like learning biases.
The comprehensive understanding of human-machine differences in linguistic abilities that the LIBILLE project develops will provide crucial insights into both the human language ability and the abilities of current LMs.
DFG Programme Priority Programmes
 
 
