Identifying the boundaries of the main content of fiction and non-fiction works in the HathiTrust Extracted Features dataset.
Updated May 10, 2022 - Jupyter Notebook
TWAIN Scanning SDK for 64-bit and 32-bit MS Access, VB.NET, C#, Delphi, and Visual C++, and for 32-bit Visual Basic 6 and VFP.
For Windows developers who need to capture images from a scanner, digital camera, or capture card that has a TWAIN device driver, using C++, C#, VB.NET, VB, Delphi, VFP, or MS Access.
Document scanner created using OpenCV and Python.
A ScanTailor customization that adds some new functions.
Automatic scan server software for scanners with a document feeder. It creates multi-page PDFs with selectable text (OCR) at the press of a single button.
Text extractor for scanned images and documents. Scans and extracts the content of the file, saving time and reducing the chance of typographical errors to 0%.
Scanned digits detector and classifier (CNN, OpenCV)
This is the open-source repo for docs.github.com.
A document indexing daemon that can populate Elasticsearch indexes with the contents and metadata of a number of document types, including PDFs, image scans, etc. Used to power Facile Search, but can be reused for anything that requires search indexing for scanned documents.
Apply OCR on scanned PDF files to extract text from the PDF images.
This repository contains automation solutions that efficiently extract text from scanned PDF documents with consistent layouts. Using the Tesseract OCR engine, the UiPath RPA robot achieves nearly 90% accuracy, streamlining the process and significantly reducing manual workload.
Debian packaging of pdfbeads
An ongoing, curated collection of awesome software best practices and techniques, libraries and frameworks, e-books and videos, websites, blog posts, links to GitHub repositories, technical guidelines, and important resources about Internet scanning in cybersecurity.
{{scan|tools|software|headware|progress|open|template|log|log|log|softwaretool|}}{[[:wikt:Scan|log scan]]}. #[[:wikt:log scan|log copyright]]. *[[:wikt:log is log|log]]. *[[:wikt:log scan|txt]]. *[[:wikt:log scan|png]]. *[[:wikt:log scan|image image image/category user/category is /category talkname/category username/category done/category in pr…
Optical Character Recognition for Scanned Documents
The web UI for Facile Search. Together with DocIndex, this UI can help you search the myriad of scanned documents you have been accumulating over the years. Using the power of Docker & Elasticsearch you can run a powerful search engine that lets you convert scanned (image-based) PDFs to searchable text, group documents by letterhead, run fuzzy s…
Auto-correct the contrast and brightness of photographed documents.
Text search using OCR, plus detection and recognition of tables in scanned documents.