Handwritten Form Reader

This OCR pipeline detects text in cropped handwritten form fields. To convert cropped form data to text, synthetic data of the required type was generated and used to train a modified version of the attention OCR model.

Sample prediction on real data

To reproduce the results of the pipeline on test data, refer to instructions.md. For details about the analysis, comparison, and hyperparameter tuning of the neural network, refer to Report.pdf. A log report, log.md, is also included in this repository and contains a succinct description of the daily work done.

Folder Structure

The folder tree is shown below. Some details have been omitted for brevity and can be found in the respective source repositories; refer to the separate README.md in each such folder.

Root
|
| .gitignore
| requirements.txt
| log.md
| README.md
| MANIFEST.in
| setup.py
| instructions.md
| my_run.sh
| Report.pdf
|
|___aocr
|   |
|   | __main__.py
|   | __init__.py
|   | defaults.py
|   | LICENSE.md
|   | README.md
|   |
|   |____model
|   |    |
|   |    | __init__.py
|   |    | cnn.py
|   |    | model.py
|   |    | seq2seq.py
|   |    | seq2seq_model.py
|   |
|   |____util
|        | 
|        | __init__.py
|        | bucketdata.py
|        | data_gen.py
|        | dataset.py
|        | export.py
|        | visualizations.py  
|     
|___text_renderer
|   |
|   | main.py
|   | README.md
|   | setup.py
|   |
|   |____dataset_labels
|   |    |
|   |    | convert_labels.py
|   |
|   |____ocr_data
|   |
|   |____example_data
|   |
|   |____text_renderer
|   |
|   |____tools
|   |
|   |____docs
|   |
|   |____docker
|   
|
|____experiments
|   |
|   | TestSyntheticDataGen.ipnb
|   | Tfwriter.ipnb
|   | Train.ipnb
|   
|____checkpoints
|   
|____app
|   
|____datasets
|   
|____utils
|

The folder aocr contains the main code for the attention OCR model, along with a separate README.md that can be used as a reference.
The folder text_renderer contains the code used to generate the synthetic dataset. The exact configuration used is in text_renderer/ocr_data/gen_data.py.
Since this model was trained and tested on Google Colab, sample notebook files are provided for reference in experiments.
The app folder contains the Flask-based REST API for testing the endpoints.
The model checkpoints are located in the checkpoints directory.
Please read Report.pdf for a detailed summary of the work done.

Reproduce results on local machine

To run the app on your local machine, follow these steps:

Clone the repository on your local machine:

git clone https://github.com/java-abhinav07/abhinav_java_9873155323-IITB-Assignment-Jul-Dec2020-Batch2.git

Install aocr locally using setup.py:

cd abhinav_java_9873155323-IITB-Assignment-Jul-Dec2020-Batch2
pip3 install -e ./

Install the necessary packages:

pip3 install -r requirements.txt

Once all the packages are installed, start the server:

python3 app/app.py

This runs the server on the local machine on port 8001. Note that CPU inference might take up to 14 seconds to process a request.

Send the request to localhost as follows:

Correct API Spec:

{
    "public_id": "74f4c926-250c-43ca-9c53-453e87ceacd1",
    "version": "v1",
    "max_width": "320",
    "max_height": "40",
    "data": {
        "image_url": "http://0.0.0.0:8000/Projects/IITB_Assignment/dataset/public/public_test_crops/TCFCD0291000010459388_M_pdf-1_datefrom.jpg"
    }
}

Incorrect API Spec:

The response status will be either completed or invalid request, indicating a successful or unsuccessful request respectively.
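As an illustration, a request shaped like the correct spec above could be built and sent with the Python standard library. This is a minimal sketch, not the client used by the author: the helper names `build_payload` and `send_request` are hypothetical, and it assumes that the local server accepts a JSON POST on the same /predict path as the public deployment and answers with JSON. The image URL in the example is a placeholder.

```python
import json
import urllib.request
import uuid

# Assumption: the local server uses the same /predict path as the public deployment.
LOCAL_ENDPOINT = "http://localhost:8001/predict"


def build_payload(image_url, max_width=320, max_height=40, public_id=None):
    """Build a request body with the same fields as the correct API spec."""
    return {
        "public_id": public_id or str(uuid.uuid4()),
        "version": "v1",
        # The spec sends the dimensions as strings, so they are converted here.
        "max_width": str(max_width),
        "max_height": str(max_height),
        "data": {"image_url": image_url},
    }


def send_request(payload, endpoint=LOCAL_ENDPOINT, timeout=60):
    """POST the payload as JSON and return the decoded JSON response.

    Assumption: the server expects a JSON body and replies with JSON.
    """
    body = json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Placeholder URL: replace it with the address of a real cropped image.
    payload = build_payload("http://0.0.0.0:8000/path/to/crop.jpg")
    print(json.dumps(payload, indent=4))
    # Uncomment once app/app.py is running locally:
    # print(send_request(payload))
```

The generous default timeout reflects the note above that CPU inference can take several seconds per request.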

Inference on Local Machine

To run inference on your local machine, you can also use the provided bash script as follows:
./my_run.sh TestImageFolderPath Output.txt

The output file Output.txt will then contain:
<Testimagefilename1> <recognized text>
<Testimagefilename2> <recognized text>
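The output format above can be read back into Python for further checks. The sketch below is only an illustration under two assumptions that the README does not confirm: each file name and its recognized text sit on their own line, and file names contain no spaces. The function name `parse_predictions` and the sample file names are hypothetical.

```python
def parse_predictions(text):
    """Map each image file name to its recognized text."""
    predictions = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first run of whitespace only,
        # because the recognized text may itself contain spaces.
        parts = line.split(None, 1)
        filename = parts[0]
        recognized = parts[1] if len(parts) > 1 else ""
        predictions[filename] = recognized
    return predictions


if __name__ == "__main__":
    # Made-up sample content in the assumed one-pair-per-line layout.
    sample = "crop_1.jpg 12/05/2020\ncrop_2.jpg JOHN DOE\ncrop_3.jpg\n"
    for name, value in parse_predictions(sample).items():
        print(name, "->", repr(value))
```

An image with no recognized text maps to an empty string rather than being dropped, so every processed file stays visible in the result.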

Access Heroku App

To send requests to a public web server, use the following URL:
https://formreader.herokuapp.com/predict

Use the same API Spec shared above in order to fetch the results.
The Heroku app has been modified and is now fully functional. However, model initialization (on the first request) might take some time due to memory constraints on Heroku.
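Because the first request may be slow while the model initializes, a client might retry a few times before giving up. The following is a minimal sketch of that idea, not part of the repository: `call_with_retry` is a hypothetical helper, `send` stands for any function that posts a payload and returns the response, and the attempt count and delays are arbitrary choices.

```python
import time


def call_with_retry(send, payload, attempts=3, delay_seconds=5.0, sleep=time.sleep):
    """Call send(payload), retrying on network errors with a growing delay.

    `send` is any callable that performs the request (an assumption of this
    sketch). `sleep` is injectable so the waiting can be replaced in tests.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return send(payload)
        except OSError as error:
            # urllib.error.URLError and socket timeouts are OSError subclasses.
            last_error = error
            if attempt < attempts:
                # Wait longer after each failure: 1x, 2x, ... the base delay.
                sleep(delay_seconds * attempt)
    raise last_error


if __name__ == "__main__":
    # Demonstration with a stand-in sender that fails once, then succeeds.
    state = {"calls": 0}

    def flaky_send(payload):
        state["calls"] += 1
        if state["calls"] == 1:
            raise OSError("server still starting")
        return {"echo": payload}

    print(call_with_retry(flaky_send, {"version": "v1"}, delay_seconds=0.1))
```

Only network-level errors are retried here; a response that reports an invalid request is returned as-is, since retrying would not change it.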

References

This repository contains code from the following two repositories:

https://github.com/emedvedev/attention-ocr : aocr
https://github.com/oh-my-ocr/text_renderer : text_renderer

The report contains a list of papers that were referenced during the design of this OCR rendition.

Acknowledgements

This project was part of the application process for a Research Internship at IIT Bombay.
