
OCR and Automatically Categorize Documents, Full-Stack Project


Junxiao-Liao/Doc-Ocr-Categorizer


Doc-Ocr-Categorizer

Technologies

  • Frontend: React, Antd
  • Backend: FastAPI
  • Relational database: PostgreSQL
  • Storage: MinIO
  • OCR: RapidOCR
  • NLP: multilingual-e5-large-instruct
  • Recommendation algorithm: vector similarity search with pgvector

Overview

This project aims to build a system that automatically recognizes and categorizes documents. It combines OCR and NLP with a recommendation algorithm to process documents intelligently.
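
As a rough illustration of the categorization step, the sketch below ranks candidate categories by cosine similarity between a document vector and per-category vectors. It is a minimal sketch under stated assumptions, not the project's actual code: the function names (`cosine_similarity`, `recommend_categories`) and the three-dimensional toy vectors are hypothetical. In the real system the vectors would presumably be embeddings of OCR text produced by multilingual-e5-large-instruct, and the nearest-neighbor search would presumably run inside PostgreSQL via pgvector rather than in Python.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


def recommend_categories(doc_vec, category_vecs, top_k=3):
    """Rank categories by similarity to the document vector, best first."""
    scored = [
        (name, cosine_similarity(doc_vec, vec))
        for name, vec in category_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy vectors standing in for real text embeddings.
CATEGORY_VECTORS = {
    "invoice": [0.9, 0.1, 0.0],
    "contract": [0.1, 0.9, 0.1],
    "receipt": [0.7, 0.0, 0.6],
}

doc_vector = [0.8, 0.2, 0.1]  # stand-in for the embedding of one OCR'd document
ranked = recommend_categories(doc_vector, CATEGORY_VECTORS, top_k=2)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

With these toy vectors, "invoice" ranks first and "receipt" second. A brute-force scan like this only suits a handful of categories; delegating the search to a vector index in the database is the usual choice once the number of stored vectors grows.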

Documentation