Project Details

Methodology and Reflection: Linguistic Discourse Historiography as Digitally Supported Group Research (TP 5: Methodological Cross-Sectional Project, Phase 2)

Subject Area Individual Linguistics, Historical Linguistics
Term since 2022
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 441142207
 
Subproject 5 is the research group's cross-sectional methodological project. In the first funding phase, it addressed the automation of discourse-pragmatic annotation, both in theory and in practical experiments. In the second funding phase, it will further develop the use of large language models (LLMs) for diachronic digital corpus analysis, with a focus on the appropriate infrastructure. For this reason, subproject 5 is also responsible for the research group's digital infrastructure. While the first funding phase laid the foundations for collaborative annotation, automatic text classification, and the establishment of a digital research environment, the second phase aims to develop these approaches systematically in light of recent advances in LLMs.
In the first funding phase, the use case for linguistic operationalization was topoi, and taggers were developed for the automatic identification of the utility and authority topoi. In the second phase, the use case will be the identification of frames and conceptual relations. Retrieval-augmented generation (RAG) is intended to capture diachronic shifts in meaning more precisely and to make them traceable. In cooperation with the other subprojects, gold standard datasets are being created through manual annotation. On this basis, LLMs will be trained, tested, and extended through RAG workflows.
To this end, our digital research environment “Discourse Lab” will be expanded to include a repository for full-text management (Solr/Lucene) and a server environment for protected LLM use (Ollama), and interfaces for export to established annotation and corpus analysis tools will be created. This will enable members of the research group to index thematically relevant full texts, annotate them manually and automatically, train their own classification models, and quantitatively evaluate the results of project-specific classification.
Subproject 5 thus combines the targeted further development of infrastructure and methodology, thereby contributing not only to the goals of the research group, but also to digital discourse analysis as a field of research. The results will be presented in international journals and at international conferences. In addition, subproject 5 participates in joint publications and activities of the research group.
DFG Programme Research Units
International Connection Switzerland
 
 

