scanned-documents

Here are 47 public repositories matching this topic...

simranbiswas / Textract

Text extractor for scanned images and documents. It scans and extracts the content of a file, saving time and, according to the author, reducing the chance of typographical errors to 0%.

scanned-documents ocr-text-reader text-extractor

Updated Jul 28, 2021
HTML

alucic2 / cluster_htrc

Identifying the boundaries of main content of fiction and non-fiction works in the HathiTrust Extracted Features dataset.

scanned-documents extracting-features clustering-algorithm digital-libraries clustering-analysis smoothing-methods detecting-paratext-boundaries

Updated May 10, 2022
Jupyter Notebook

{{scan|tools|software|headware|progress|open|template|log|log|log|softwaretool|}}{[[:wikt:Scan|log scan]]}. #[[:wikt:log scan|log copyright]]. *[[:wikt:log is log|log]]. *[[:wikt:log scan|txt]]. *[[:wikt:log scan|png]]. *[[:wikt:log scan|image image image/category user/category is /category talkname/category username/category done/category in pr…

linux-kernel scans scanned-documents ubuntu-server linux-server scancode unixporn scansnap-organizer scans-xhr-requests scans-directories

Updated Oct 23, 2020

deckerego / docmag

The web UI for Facile Search. Together with DocIndex, this UI can help you search the myriad of scanned documents you have been accumulating over the years. Using the power of Docker & Elasticsearch you can run a powerful search engine that lets you convert scanned (image-based) PDFs to searchable text, group documents by letterhead, run fuzzy s…

docker kubernetes pdf elasticsearch full-text-search scanned-documents

Updated Oct 26, 2023
Groovy

binDebug3 / scanner_automation

A program by Dallin Stewart that automates simple, repetitive tasks while scanning documents.

automation data-entry scan-tool scanned-documents mortgage pyautogui pyautogui-automation

Updated May 12, 2023
Python

hacker-or-id / docs

This is the open-source repo for docs.github.com.

typescript actions scan open-data scan-tool scanned-documents ubuntu-server linux-server sistem scancode

Updated Oct 16, 2020
JavaScript

AdroitAnandAI / Multilingual-Text-Inversion-Detection-of-Scanned-Images

Efficient Text Localization Algorithm, Image Inversion Detection of Scanned Documents & Language Identification based on Shape Context and Traditional Computer Vision.

multilingual computer-vision shape text images detection inversion efficient scanned-documents language-identification shape-context scanned-images image-inversion text-localization traditional-algorithm inversion-detection

Updated Dec 18, 2021
Python

timberger / Searchable-Image-PDF-Creat-O-Mat

This batch script creates a searchable PDF from a PDF with one or more scanned pages that contain images.

pdf ghostscript imagemagick converter ocr drag drop tesseract scan batch scanned-documents batch-script scanned-pages imagemagick-wrapper searchable-pdfs scanned-image-pdfs tesseract-wrapper ghostscript-wrapper searchable-pdf

Updated Oct 22, 2022
Batchfile

Hawk453 / OCR_FOR_PDFS

Optical Character Recognition for Scanned Documents

opencv ocr scanned-documents optical-character-recognition pdfs

Updated Nov 15, 2020
Python

rohanrav / document-scanner

Document scanner created using OpenCV and Python.

python3 scanned-documents opencv-python

Updated Mar 13, 2019
Python

MaxineXiong / Scraping-Scanned-PDF-Docs-using-OCR-with-RPA

This repository contains automation solutions that efficiently extract text from scanned PDF documents with consistent layouts. Using the Tesseract OCR engine, the UiPath RPA robot achieves nearly 90% accuracy, streamlining the process and significantly reducing manual workload.

ocr scanned-documents optical-character-recognition screen-scraping rpa robotic-process-automation uipath uipath-studio scanned-receipts uipath-modern-design uipath-classic-design

Updated Apr 17, 2024

bearrundr / scantailor-custom

A ScanTailor customization that adds some new functions.

image-processing djvu scanned-documents book-scanning digitization

Updated Oct 5, 2019
C++

hnjm / papermerge

Open Source Document Management System for Digital Archives (Scanned Documents)

python pdf django ocr archives scan scanned-documents dms document-management paperless hnjm

Updated Jan 5, 2023
Python

legenscandary / scan

Automatic scan server software for scanners with a document feeder. It creates multi-page PDFs with selectable text (OCR) at the press of a single button.

pdf ocr samba scanned-documents shell-scripts pdf-generation scanning

Updated Feb 19, 2024
Shell

rbrito / pkg-pdfbeads

Debian packaging of pdfbeads

pdf pdf-converter scanned-documents pdf-generation scanning scanned-image-pdfs

Updated May 11, 2020
Ruby

milahu / document-photo-auto-threshold

Auto-corrects the contrast and brightness of photographed documents.

image-processing contrast brightness scan-tool scanned-documents postprocessing contrast-enhancement brightness-adjustment

Updated Oct 12, 2021
Python

paulveillard / cybersecurity-internet-scanning

An ongoing, curated collection of software best practices and techniques, libraries and frameworks, e-books and videos, websites, blog posts, links to GitHub repositories, technical guidelines, and other important resources about Internet scanning in cybersecurity.

scanner scanned-documents scanning scanning-tool