Project Details
Interactive grammar analysis of historical texts: Adaptive annotation approach to reconstruct the grammatical elaboration of Middle Low German (InterGramm)
Applicants
Professorin Dr. Michaela Geierhos; Professor Dr. Eyke Hüllermeier; Professorin Dr. Doris Tophinke
Subject Area
Applied Linguistics, Computational Linguistics
Term
from 2016 to 2021
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 317446073
Our research project investigates the (structural) elaboration of Middle Low German (MLG) from the 13th century to the written language shift in the 16th/17th century. During this shift, MLG lost its dominant position as a supraregional written language as the influence of (written) Early New High German (ENHG) grew. The study makes an important contribution to the reconstruction of grammatical developments in written MLG, which have so far been examined only to a limited extent. The study focuses on urban legal statutes for several reasons. We assume that processes of language elaboration can be investigated especially well in legal texts: they must render complex (legal) matters comprehensible independently of contextual information, so elaborated linguistic structures capable of such a construal have to be developed. These legal matters are primarily conditional relations; consequently, we can examine changes in the linguistic construction of conditionality over the investigation period. Furthermore, legal statutes can be located and dated, so that the developmental dynamics of elaboration processes can be reconstructed in space and time. Existing parsing and tagging systems in computational and corpus linguistics require static (a priori defined) grammars and grammatical categories, which cannot cope with the historical dynamics of grammar. To address one of the central problems of existing annotation tools for historical texts, we plan to develop an interactive computational tool that combines machine learning and expert feedback. Using rule-based text analysis techniques and machine learning methods for "discovering" an evolving, dynamic grammar, and thereby reconstructing language change in an evidence-based way, is a novelty.
Since this presupposes both historical language knowledge and skills in computational linguistics and computer science, the project requires close cooperation between the disciplines over the entire funding period. Our empirical base consists of dateable and locatable legal texts from the 13th to the 17th century, forming a corpus that is divided into two subcorpora. The subcorpus MLG (main corpus) consists of Middle Low German texts from 1227 to 1650 and comprises about 1.2 million words. The subcorpus ENHG consists of a selection of the first Early New High German texts produced in the Low German language area after the written language shift and comprises 400,000 words. Our aim is to verify the assumption that these Early New High German texts instantiate a Middle Low German syntax/grammar despite their ENHG lexis.
DFG Programme
Research Grants