The repo gt_structure_1_3 is part of the OCR-D Ground Truth Structure corpus. Only the structure of the printed page is annotated. The corpus was created as a result of the DFG project OCR-D.
Updated May 21, 2024
OCR-D wrapper for page-xml-draw
The repo gt_structure_1_4 is part of the OCR-D Ground Truth Structure corpus. Only the structure of the printed page is annotated. The corpus was created as a result of the DFG project OCR-D.
The GBN Dataset consists of German-Brazilian historical newspapers, along with their digital and binarized images and ground truth files.
The repo gt_structure_1_2 is part of the OCR-D Ground Truth Structure corpus. Only the structure of the printed page is annotated. The corpus was created as a result of the DFG project OCR-D.
The repo gt_structure_1_1 is part of the OCR-D Ground Truth Structure corpus. Only the structure of the printed page is annotated. The corpus was created as a result of the DFG project OCR-D.
OCR-D guidelines for Ground Truth production
A powerful CLI tool for visualization and encoding of PAGE-XML files
XSLT and shell scripts for analyzing and creating GitHub pages of a ground truth repository. These are centrally managed and can be used by all repositories created with gt-repo-template (https://github.com/OCR-D/gt-repo-template).
Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader)
Page to PAGE Layout Analysis Tool