Project Details

Development of methods in the context of image processing and digital media for palaeographic research, in particular the transcription of handwriting, writer comparison, and writer profiles in historical documents (Diptychon)

Subject Area Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Term from 2011 to 2017
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 203606267
 
Diptychon is a transcription assistant system that automatically separates glyphs and attempts to classify them. The user can easily apply a number of interactive methods to correct any defective suggestions. The renewal proposal names five thematic groups to be investigated: (i) The analysis of editions: existing editions are to be imported into the system so that the original document image can be analysed against them. This is a specific subproblem in the context of unconstrained handwriting analysis and search. (ii) Writer-specific dictionaries are to be employed to improve the performance of the automatic suggestions. Such dictionaries capture the peculiarities of an individual writer. In particular, the context information they provide makes it possible to deal with defective glyphs. (iii) Metadata are to be used for dealing with special characters, document defects, and multi-column and differently oriented texts. In addition, the abbreviations typical of mediaeval texts are to be handled appropriately in the transcription. All of these issues must be addressed in order to provide a complete edition of a text. (iv) A fundamental issue is the glyph separation ambiguity: it is difficult for both algorithms and users to separate glyphs appropriately. Interactive methods, and features that are invariant with respect to glyph transitions, are to be developed. This also provides a new means of characterising writers, namely by the appearance of their glyph transitions. (v) Transcriptions are to be exported in a sophisticated format that conforms to the rules of XML-TEI. For certificates of mediaeval emperors in particular, a specific schema is to be defined. A benchmark test with a transcribed data set of such documents is to be provided to the scientific community. Among other uses, this is of interest for comparing machine learning and classification algorithms.
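As a rough illustration of point (ii), the following Python sketch shows one way a writer-specific dictionary could use word context to propose readings for a word that contains defective glyphs. It is not taken from Diptychon: the function names, the "?" marker for a defective glyph, and the sample words are all assumptions made for this example.

```python
import re
from collections import Counter


def build_writer_dictionary(transcribed_words):
    """Count how often each word occurs in text already transcribed for one writer.

    Hypothetical helper: the real system may store far richer information.
    """
    return Counter(word.lower() for word in transcribed_words)


def suggest_words(pattern, writer_dictionary, unknown="?"):
    """Return dictionary words that match the pattern, most frequent first.

    Each `unknown` marker stands for exactly one defective glyph (an assumption
    of this sketch); all other characters must match literally.
    """
    regex = re.compile(
        "".join("." if ch == unknown else re.escape(ch) for ch in pattern.lower())
    )
    matches = [
        (word, count)
        for word, count in writer_dictionary.items()
        if regex.fullmatch(word)
    ]
    # Sort by descending frequency, then alphabetically for a stable order.
    return [word for word, _ in sorted(matches, key=lambda item: (-item[1], item[0]))]


# Invented sample: words assumed to be already transcribed for one writer.
sample_words = ["gratia", "gratia", "gracia", "regis", "regni", "dei"]
writer_dictionary = build_writer_dictionary(sample_words)
candidates = suggest_words("gra?ia", writer_dictionary)
```

Here `candidates` ranks the spelling this writer uses more often first, which is the sense in which a per-writer dictionary can reflect individual peculiarities. A real system would presumably combine such a ranking with the classifier's own glyph scores.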
DFG Programme Research Grants
Participating Person Professor Dr. Michael Menzel
 
 

