CAVAL: The Classical Armenian Valency Lexicon

Applicant Petr Kocharov, Ph.D.

Subject Area General and Comparative Linguistics, Experimental Linguistics, Typology, Non-European Languages
Applied Linguistics, Computational Linguistics

Term since 2023

Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 518003859

Project Description

The Classical Armenian Valency Lexicon (CAVAL) project is dedicated to creating a corpus-based valency lexicon and facilitating access to the verbal morphosyntax of Classical Armenian. Classical Armenian is the oldest documented variety of the Armenian language, first recorded in the early 5th century. Its grammar represents an intricate fusion of Indo-European archaisms and innovations resulting from internal changes and from contact with the neighbouring languages of Eastern Anatolia and the Southern Caucasus. Despite its value for general, typological and historical linguistics, Classical Armenian remains an under-resourced language. The currently available digital resources include collections of digitized Classical Armenian texts. Only a small share of these texts is provided with morphological annotation, and syntactically annotated texts are exceptional. Building on existing resources, CAVAL will make a significant contribution to the natural language processing of Classical Armenian. The project will result in a corpus of morphologically and syntactically annotated texts, from which a comprehensive valency lexicon will be generated automatically. The lexicon will be provided with a flexible user interface enabling search queries on argument structures, their morphological expression and their lexical distribution, together with text frequencies. The project continues a series of digital valency lexica created for other ancient Indo-European languages, in particular Latin (IT-VaLex) and Homeric Greek (HoDeL), and will extensively reuse their methodological and technological solutions. In addition to the functionality of these two resources, CAVAL will offer important improvements. In particular, the syncretic case marking of verbal arguments typical of Classical Armenian will be processed in two modes, formal (with the ambiguity of syncretic forms retained) and functional (with that ambiguity resolved), thus significantly increasing the research potential of the resource.
More importantly, CAVAL will add a diachronic dimension to the valency lexicon. Verbs of Indo-European origin will be provided with etymological annotation enabling the diachronic integration of CAVAL with the valency lexica of other ancient Indo-European languages, thus initiating a new type of diachronic valency lexicon for Indo-European languages. To date, there is no digital valency lexicon for any variety of Armenian; CAVAL will set a model that can be applied to post-classical varieties of Armenian, including the two modern literary languages, Eastern and Western Armenian, and modern Armenian dialects. The project results, including the annotated texts and the program code, will be made available in open-access repositories.
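
The idea of deriving a lexicon automatically from an annotated corpus, and of viewing syncretic case marking in a formal and a functional mode, can be illustrated with a minimal sketch. It is not the project's implementation: the data model (`Arg`, `Clause`), the functions `frame` and `build_lexicon`, the relation and case labels, and the placeholder lemma `verb_a` are all hypothetical and chosen only for illustration. Each annotated clause is reduced to a frame of (relation, case) pairs, and frames are counted per verb lemma.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Arg:
    """One verbal argument (hypothetical annotation schema)."""
    relation: str         # syntactic relation, e.g. "subj", "obj"
    formal_case: str      # label of the possibly ambiguous form, e.g. "NomAcc"
    functional_case: str  # case after disambiguation in context, e.g. "Acc"


@dataclass(frozen=True)
class Clause:
    """One annotated clause: a verb lemma and its arguments."""
    verb_lemma: str
    args: tuple


def frame(clause, mode):
    """Return the clause's valency frame as a sorted tuple of (relation, case)."""
    if mode not in ("formal", "functional"):
        raise ValueError("mode must be 'formal' or 'functional'")
    return tuple(sorted(
        (a.relation, a.formal_case if mode == "formal" else a.functional_case)
        for a in clause.args
    ))


def build_lexicon(clauses, mode):
    """Map each verb lemma to a Counter of its frames with text frequencies."""
    lexicon = {}
    for clause in clauses:
        lexicon.setdefault(clause.verb_lemma, Counter())[frame(clause, mode)] += 1
    return lexicon


# Toy corpus with an invented lemma; not real Classical Armenian data.
subj = Arg("subj", "NomAcc", "Nom")
obj = Arg("obj", "NomAcc", "Acc")
iobj = Arg("iobj", "GenDat", "Dat")
corpus = [
    Clause("verb_a", (subj, obj)),
    Clause("verb_a", (subj, obj, iobj)),
    Clause("verb_a", (obj, subj)),
]

formal = build_lexicon(corpus, "formal")
functional = build_lexicon(corpus, "functional")

for frm, count in functional["verb_a"].most_common():
    print(count, frm)
```

In this sketch both modes are computed from the same annotation, so a query can switch between them without re-annotating the text; how the actual resource stores and resolves ambiguity is not specified in the description above.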

DFG Programme Research Grants
