Project Details
Workflow for work-specific training based on generic models with OCR-D and upgrading of ground truth data
Applicant
Dr. Sabine Gehrlein
Term
from 2021 to 2023
Project identifier
Deutsche Forschungsgemeinschaft (DFG) - Project number 460547474
Project Description
The project will implement a workflow that lets institutions train work-specific models for text recognition with OCR-D as easily as possible. These models are intended to achieve particularly high recognition rates and to cover domain-specific glyphs. Training builds on existing and new generic models that cover a broad spectrum of fonts from different centuries. The creation or improvement of the necessary ground truth will be supported by tools that help to find errors in the ground truth, to correct them easily, and to upgrade the data to level 2 of the OCR-D transcription guidelines.
DFG Programme
Research data and software (Scientific Library Services and Information Systems)
Co-Investigator
Stefan Weil