Project Details

OCR4all-libraries – Full-Text Transformation of Historical Collections

Term from 2021 to 2024
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 460665940
 
The planned cooperation between the Georg Eckert Institute - Leibniz Institute of International Textbook Studies (GEI), the “Kallimachos” Centre of Philology and Digitality (ZPD) and the Chair of Human-Computer Interaction (HCI) at the University of Würzburg aims to upgrade and adapt the open-source, GUI-based tool “OCR4all” so that libraries and archives undertaking mass digitization can implement the solutions developed in the “OCR-D” project in a low-threshold, flexible and self-sufficient manner. The GEI research library, with its collection of digitized 17th and 18th century textbooks, will serve as the use case. The OCR quality of its digitized holdings varies greatly, in part because complex layouts and mixed typography tend to impede high-quality text recognition. To improve OCR quality, a method that is as generic as possible will be implemented, enabling full-text recognition of entire collections of similar documents. In the interest of user-friendliness, and to compensate for the growing complexity of such OCR solutions, the user interface will be regularly adjusted and refined in close cooperation with, and under the supervision of, the HCI chair. In addition, a visual explanation component will support the creation and configuration of optimal OCR workflows. All solutions developed during the project will be evaluated at every stage through comprehensive user assessments, to ensure that lay users in libraries and archives can use OCR-D solutions comfortably and autonomously.
DFG Programme Research data and software (Scientific Library Services and Information Systems)
 
 
