Does morphosyntactic alignment shape discourse? Implementing a corpus-based approach to linguistic typology
Final Report Abstract
The project addressed the question of whether the morphological alignment of a language has any impact on the organization of connected discourse in that language. Specifically, we aimed to compile and annotate a sample of spontaneous spoken-language corpora from typologically diverse languages with varying morphological alignments, and to conduct cross-corpus quantitative analyses of this data set. For reasons beyond our control, we could not include a globally representative range of ergative languages, because naturalistic speech data from the ergative languages of Australia proved difficult to access. Despite the sparse and skewed sample of non-accusative languages at our disposal, we can provisionally report that there is no significant effect of morphological alignment on discourse structure, at least with regard to the metrics presented here. From a broader perspective, these findings suggest that discourse organization is determined by robust and universal principles that are to some extent independent of the minutiae of cross-linguistic diversity in morphosyntax. Ergative alignment, where it is found, appears to be relevant solely to morphology, or at most to highly constrained and tight-knit domains of syntax, with few or no ramifications for discourse. These findings have considerable relevance for understanding the relationship of discourse to grammar, and for assessing the validity of emergentist approaches that see grammar as the crystallization of frequency distributions in discourse. Our results suggest that the rampant cross-linguistic variability in morphosyntax may arise through quite varied and fairly random historical processes, as yet poorly understood, with a more opaque relationship to discourse than is commonly assumed within functionalist approaches to grammar.
From a theoretical and methodological perspective, the project has been an unmitigated success in demonstrating the efficacy of corpus-based typology, using spoken-language data from underresearched languages. The Multi-CAST collection, a major project outcome, is fully compliant with the principles of Open Science and is already feeding into state-of-the-art research on a range of topics in the language sciences.
Publications
-
2017. Do grammatical relations reflect information status? Reassessing Preferred Argument Structure against discourse data from Tondano. Linguistic Typology 21(1). 177–209
Brickell, Timothy & Schnell, Stefan
-
2018. Discourse motivations for pronominal and zero objects across genres in Vera'a. Language Variation and Change 30(1). 51–81
Schnell, Stefan & Barth, Danielle
-
2018. multicastR: A companion to the Multi-CAST collection. R package version 2.0.0. In Haig, Geoffrey & Schnell, Stefan (eds.), Multi-CAST: Multilingual corpus of annotated spoken texts
Schiborr, Nils N.
-
2021. Doing corpus-based typology with spoken language data: State of the art. Special Publication of Language Documentation and Conservation. Honolulu, HI: University of Hawai'i Press
Haig, Geoffrey & Schnell, Stefan & Seifart, Frank (eds.)
-
2021. Efficiency in discourse processing: Does morphosyntax adapt to accommodate new referents? In Levshina, Natalia & Moran, Steven (eds.), Efficiency in human languages: Corpus evidence for universal principles. Linguistics Vanguard special issue 7(s3)
Schnell, Stefan & Schiborr, Nils N. & Haig, Geoffrey
-
2021. Universals of reference in discourse and grammar: Evidence from the Multi-CAST collection of spoken corpora. In Haig, Geoffrey & Schnell, Stefan & Seifart, Frank (eds.), Doing corpus-based typology with spoken language corpora, 141–177. Language Documentation & Conservation special publication 25. Honolulu: University of Hawai'i Press
Haig, Geoffrey & Schnell, Stefan & Schiborr, Nils N.
-
2022. Cross-linguistic corpus studies in linguistic typology. Annual Review of Linguistics 8. 171–191
Schnell, Stefan & Schiborr, Nils N.