Project Details

Crosslingual Language Varieties: A Multifaceted Investigation

Subject Area General and Comparative Linguistics, Experimental Linguistics, Typology, Non-European Languages
Term from 2018 to 2023
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 398186468
 
Final Report Year 2024

Final Report Abstract

The research project Crosslingual Language Varieties was motivated by the ubiquity of bilingualism and multilingualism in today’s world, and aimed to advance our knowledge of the diversity of language varieties used by speakers of one, two, or more languages. We posed several research questions about the similarities and differences among various language varieties (including native language, learner language, advanced nonnative language and translation), and proposed three research methods that complement and inform each other: computational analysis with large corpora; deep linguistic analysis with smaller, tightly controlled corpora; and psycholinguistic experiments. To address these questions, we placed great emphasis on the construction of infrastructure, namely corpora, annotation schemes and other language resources. A further focus was the theoretical foundation of the predominant native-speaker comparison paradigm and its justification. Language resources developed as part of the project include the Hebrew Learner Corpus, with a detailed annotation scheme of target hypotheses; and a subset of the Falko corpus with a deep annotation of reflexivity, based on an annotation scheme that was developed for this project. The main research contributions of the project address the differences between native and learner language, considering possible first language (L1) influence as a potential explanation for such differences. In one study, based on the Hebrew corpus data introduced above, we aimed to identify non-native essay texts, attribute the learners’ L1, and predict their proficiency level. To this end, we trained a model to find the most influential linguistic characteristics, and the combination of them that best performs these classifications.
In another study, we used computational methods to test whether learners of English tend to prefer words stemming from the language family of their own L1 over etymologically different synonyms. We confirmed that such a tendency is indeed evident in learners’ essays, and that it declines with rising proficiency. Yet another contribution uses a small, deeply annotated corpus of German as a foreign language to explore the underuse of reflexives in learner essays, finding that learners cope surprisingly well with patterns that require learning by rote. The insights gained from the project lie mainly in the area of second language acquisition and the different kinds of L1 influence on an L2. The discussion of different methodological approaches to crosslinguistic influence was especially fruitful for the setup of infrastructure, but also for the evaluation of research results in the light of statistical robustness and linguistic adequacy.
