Project Details
CAVAL: The Classical Armenian Valency Lexicon
Applicant
Petr Kocharov, Ph.D.
Subject Area
General and Comparative Linguistics, Experimental Linguistics, Typology, Non-European Languages
Applied Linguistics, Computational Linguistics
Term
since 2023
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 518003859
The Classical Armenian Valency Lexicon (CAVAL) project is dedicated to creating a corpus-based valency lexicon and to facilitating access to the verbal morphosyntax of Classical Armenian. Classical Armenian is the oldest documented variety of the Armenian language, first recorded in the early 5th century. Its grammar represents an intricate fusion of Indo-European archaisms and innovations resulting from internal changes and from contact with the neighbouring languages of Eastern Anatolia and the Southern Caucasus. Despite its value for general, typological and historical linguistics, Classical Armenian remains an under-resourced language. The currently available digital resources include collections of digitized Classical Armenian texts. Only a small share of these texts is provided with morphological annotation, and syntactically annotated texts are exceptional. Building on existing resources, CAVAL will make a significant contribution to the natural language processing of Classical Armenian. The project will result in a corpus of morphologically and syntactically annotated texts, from which a comprehensive valency lexicon will be generated automatically. The lexicon will be provided with a flexible user interface enabling search queries of argument structures, their morphological expression and their lexical distribution, with text frequencies. The proposed project will continue a series of digital valency lexica created for other ancient Indo-European languages, in particular Latin (IT-VaLex) and Homeric Greek (HoDeL), and will extensively reuse the existing methodological and technological solutions. Beyond the functionality of these two resources, CAVAL will offer important improvements. In particular, the syncretic case marking of verbal arguments typical of Classical Armenian will be processed in two modes, formal and functional (that is, with the formal ambiguity either removed or retained), thus significantly increasing the research potential of the resource.
More importantly, CAVAL will build a diachronic dimension into the valency lexicon. The verbs of Indo-European origin will be provided with an etymological annotation enabling the diachronic integration of CAVAL with the valency lexica of other ancient Indo-European languages, thus initiating a new type of diachronic valency lexicon for Indo-European languages. To date, there is no digital valency lexicon for any variety of Armenian; CAVAL will set a model that can be applied to post-classical varieties of Armenian, including the two modern literary languages, Eastern and Western Armenian, and modern Armenian dialects. The project results, including the annotated texts and the program code, will be made available in open-access repositories.
DFG Programme
Research Grants