Syntactic Patterns in Pite Saami: A Corpus-Based Investigation of Variation and Change over 130 Years
General and Comparative Linguistics, Experimental Linguistics, Typology, Non-European Languages
Applied Linguistics, Computational Linguistics
Summary of Project Results
This project investigated linguistic structures in Pite Saami, a highly endangered Uralic language spoken by around 30 individuals from the Arjeplog municipality in Swedish Lapland. Its goal was to understand syntactic patterns in spoken-language texts spanning more than a century of documented Pite Saami language use. Specifically, the project examined syntactic structures, that is, how individual words can be combined to form phrases, and how phrases can in turn be combined to form clauses and sentences. Aside from adding to our general knowledge of how human language can work, the project also explored how to formalize this understanding of Pite Saami syntactic structures in a way that a computer can process. As a result, computational tools were developed that automatically analyze Pite Saami texts: given a Pite Saami sentence as input, the tools output which words are involved (lemmatization) and which linguistic categories are present (morphology and part of speech), and even provide a rough word-for-word English translation. In addition to enabling the automatic analysis of large amounts of Pite Saami text for further research, these tools can also be used for developing spell-checkers and grammar-checkers, which can be especially valuable for a language community as small as that of Pite Saami. Lastly, the Pite Saami collection at the Endangered Languages Archive grew in size and quality as a result of this project; this collection will serve as a dataset for future investigations into Pite Saami language and culture.
Project-Related Publications (Selection)
- 2017. “Instant Annotations: Applying NLP Methods to the Annotation of Spoken Language Documentation Corpora”. In Proceedings of the Third International Workshop on Computational Linguistics for Uralic Languages: Proceedings of the Workshop. ACL Anthology. St. Petersburg, Russia: Association for Computational Linguistics. 25-36
Gerstenberger, Ciprian, Niko Partanen, Michael Rießler, J. Wilbur
(See online at https://doi.org/10.18653/v1/w17-0604)
- 2018. Pite Saami Finite State Transducer (morphological parser) and Disambiguator (Constraint Grammar syntactic disambiguator)
Wilbur, J.
- 2019. “ELAN as a search engine for hierarchically structured, tagged corpora”. In Proceedings of the 5th International Workshop for Computational Linguistics for Uralic Languages (IWCLUL 2019). Tartu: Association for Computational Linguistics. 90-103
Wilbur, J.
(See online at https://dx.doi.org/10.18653/v1/W19-0308)
- 2019. “Using computational approaches to integrate endangered language legacy data into documentation corpora. Past experiences and challenges ahead”. In Proceedings of the Workshop on Computational Methods for Endangered Languages. Vol. 2. Honolulu: Association for Computational Linguistics. 24-30
Blokland, Rogier, Niko Partanen, Michael Rießler, J. Wilbur
(See online at https://doi.org/10.33011/computel.v2i.451)
- 2021. “Envisioning digital methods for fieldwork in the Arctic”. In M. Lehtimäki, A. Rosenholm & V. Strukov (eds.), Visual Representations of the Arctic: Imagining Shimmering Worlds in Culture, Literature and Politics. Routledge Interdisciplinary Perspectives on Literature. London: Routledge. 313-339
Partanen, Niko, Michael Rießler, J. Wilbur
(See online at https://doi.org/10.4324/9781003158295-22)