Project Details

A Computational Exploration of Construction Learning

Applicant Leonie Weißweiler
Subject Area Applied Linguistics, Computational Linguistics
Term since 2024
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 556596361
 
What are the ways in which we can study language? The easiest is introspection: consulting our native-speaker intuitions about which sentences are acceptable, and then attempting to construct a system that explains those intuitions. To be more data-driven, we can consult corpora of sentences that have actually been uttered and attempt to build systems that predict these sentences and no others. In this way, we can try to reverse-engineer the abstractions in the human mind that produced these sentences, even though we cannot access them directly. A limitation of corpus methods, however, is that they are fundamentally observational rather than causal. Experimental linguistics makes it possible to test hypotheses about language structure and learning directly by manipulating variables. But this approach is limited to a small number of items in a controlled lab setting. More importantly, it can only modify a participant's input by adding words to the sum of language data they have already encountered in their life. More powerful causal inferences, which would require modifying something in the total language input of a human and observing the result, are ruled out on ethical grounds: we cannot, for example, ensure that a child never receives negative feedback on any utterance. Language models (LMs) now offer a powerful new addition to this toolkit, overcoming some of the limitations of earlier methods. LMs appear capable of learning complex grammatical structure from data, even with little built-in linguistic knowledge. Crucially, they are also amenable to direct causal intervention: we can manipulate their training data and measure directly how learning is affected. I therefore propose to use language models as proxies for human learning, to form and test hypotheses about the processes of language learning that would be impossible to validate on humans directly.
Specifically, I will use the recently introduced BabyLM corpus, created to emulate the amount and type of total language input received by an average 13-year-old. Throughout this work, I will carefully consider the ways in which LM learning can differ from human learning. The viability of this approach has already been shown by Misra and Mahowald, who found that learning of the AANN construction depends on the presence of sentences compatible with phenomena that share its core properties. Working with Prof. Kyle Mahowald at UT Austin, I will use this approach to test a variety of theoretically informed questions about the learning of rare syntactic constructions, focusing specifically on which characteristics of the input are necessary for learning to occur. In Year 2, I will work with Prof. Reut Tsarfaty at Bar Ilan University to extend this approach to constructions across a range of typologically diverse languages, thereby broadening both the generality of the conclusions we can draw and the scope of languages considered in this area of computational linguistics.
DFG Programme WBP Fellowship
International Connection Israel, USA
 
 
