Project Details

Unravelling Linguistic Knowledge via Multilingual Embedding Spaces and Latent Information (B06)

Subject Area Applied Linguistics, Computational Linguistics
General and Comparative Linguistics, Experimental Linguistics, Typology, Non-European Languages
Term since 2015
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 232722074
 
Embeddings (mono- and multilingual, static or contextualised) are the workhorses of modern language technologies. They capture semantic, grammatical, morphological and other information. Multilingual embeddings are especially promising: word and sentence translations lie close together in multilingual embedding space, and the embeddings allow fine-tuning as well as few- and zero-shot learning. They constitute the core technology underpinning our previous research on translationese in Phase II. In Phase III, B06 focuses on (i) information spreading in embedding spaces, (ii) translationese subspaces and (iii) extracting tacit background knowledge from translation data, particularly for situations where isomorphism between spaces does not and should not hold. It also investigates how (i) to (iii) affect information density-based approaches to translation.
DFG Programme Collaborative Research Centres
Applicant Institution Universität des Saarlandes
 
 
