Project Details
Syntactic patterns in Pite Saami: A corpus-based exploration of 130 years of variation and change
Applicant
Dr. Joshua Wilbur
Subject Area
Individual Linguistics, Historical Linguistics
General and Comparative Linguistics, Experimental Linguistics, Typology, Non-European Languages
Applied Linguistics, Computational Linguistics
Term
from 2016 to 2021
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 286335341
The ultimate goal of this project is to create thorough, corpus-based descriptions of syntactic patterns in Pite Saami, a highly endangered Uralic language spoken in Swedish Lapland. The corpus will consist of Pite Saami texts in the spoken mode representing more than a century of language use from the late 19th up to the early 21st centuries. With this in mind, the age of source data will be treated as a potential factor in explaining variation in attested patterns, thus allowing for the investigation of structural changes through time.
Specifically, the project will use quantitative methods in attempting to answer the following main research questions.
1. Which constituents are possible and/or required in the various Pite Saami phrase and clause types? Is there a preference for certain structures?
2. What effect does information structure have on constituent structure?
3. Does the corpus provide evidence for diachronic changes in syntactic patterns? If so, which patterns are affected?
In order to carry out the investigation, an annotated corpus must first be created. To do this efficiently, extant language technology tools will be refined to automatically tag Pite Saami texts for lexeme, morphological categories and part-of-speech.
The results of the project will be three-fold:
1. a book-length description of attested syntactic patterns;
2. a thoroughly annotated, digital spoken language corpus spanning more than a century of texts for an endangered Saami language, to be available for further research; and
3. a model for the use of language technology tools to automatically annotate a spoken language corpus for an endangered language.
The planned syntactic description will provide new data concerning a hitherto under-described language. These data will not only be of interest to Uralic language scholars, particularly for historical-comparative studies, but also to synchronic comparative theoretical linguists with both formal and functional approaches.
DFG Programme
Research Grants