GSSoC'24: OCR Detection #62

SAM-DEV007 · 2024-05-14T18:56:13Z

Resolves #56

This pull request adds OCR detection, the feature enhancement requested in #56.

OCR_Detection

OCR detection is introduced to help victims who cannot communicate or seek help by any means other than showing written text to the camera. It can also help detect potential self-harm: if the camera catches a glimpse of a victim writing a suicide note, the detected text can be passed to existing models that take text as their primary input to determine the scale of the threat.

Usage

Note that a window is created only if the model detects text. To visualize another image, the current window must be closed before the next one can appear.

Run demo.py to start the webcam and begin capturing frames.
Press Ctrl + C to exit the script.
If text is detected, a new window opens showing the detected text, the annotations, and the confidence. The detected text is also printed for convenience.

Working

The easyocr package provides the image-to-text detection. Model_Data contains the downloaded model, which reduces the online dependency.

detect.py contains functions that other scripts can import to perform image-to-text detection.
demo.py contains demo code that showcases the functionality.
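As a rough illustration of the kind of helpers described above, the sketch below wraps easyocr behind two small functions. The names `make_reader`, `read_text`, and `summarize` are hypothetical and are not taken from detect.py; the English language list and the confidence filter are assumptions as well. Only `easyocr.Reader` and `Reader.readtext` are real library calls.

```python
from typing import List, Sequence, Tuple

# One detection in easyocr's default output: (bounding box, text, confidence).
Detection = Tuple[Sequence, str, float]


def make_reader(model_dir: str = "Model_Data"):
    """Build an easyocr reader that loads its models from a local folder.

    Hypothetical helper: the language list ["en"] is an assumption.
    """
    import easyocr  # imported lazily so summarize() works without easyocr

    return easyocr.Reader(
        ["en"],
        model_storage_directory=model_dir,  # use the models shipped locally
        download_enabled=False,             # never fetch models from the network
        verbose=False,                      # keep the console output quiet
    )


def read_text(reader, image) -> List[Detection]:
    """Run detection on an image path or a NumPy array (hypothetical helper)."""
    # paragraph=False keeps one result, with its own confidence, per text region.
    return reader.readtext(image, paragraph=False)


def summarize(results: List[Detection], min_conf: float = 0.0) -> Tuple[str, float]:
    """Join the detected text and average the confidence of the kept regions."""
    kept = [(text, float(conf)) for _bbox, text, conf in results if conf >= min_conf]
    if not kept:
        return "", 0.0
    full_text = " ".join(text for text, _ in kept)
    mean_conf = sum(conf for _, conf in kept) / len(kept)
    return full_text, mean_conf
```

Creating the reader once and reusing it for every frame avoids reloading the model on each call, which would matter in a webcam loop.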

OpenCV without GUI (opencv-python-headless) is used to optimize the script for detection. It helps detection speed by leaving out the GUI components that the script does not need.
Additionally, web integration does not need a GUI, while the other functionality remains the same.

demo.py also contains an optimization that skips model detection when the difference between frames is small, i.e., the frame hasn't changed much. MSE (Mean Squared Error) is used to measure the difference between two frames. The model runs only if the error is greater than 20. This threshold can be changed by modifying the value of ERR_DIFF.
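The frame-difference check can be sketched as follows. This is a minimal version assuming frames arrive as NumPy arrays of equal shape (as OpenCV returns them); the function names `mse` and `frame_changed` are illustrative and may not match the ones in demo.py. Only the ERR_DIFF value of 20 comes from the description.

```python
from typing import Optional

import numpy as np

ERR_DIFF = 20  # threshold from the description; raise it to run the model less often


def mse(frame_a: np.ndarray, frame_b: np.ndarray) -> float:
    """Mean squared error between two frames of the same shape."""
    # Cast to float first: subtracting uint8 arrays directly would wrap around.
    a = frame_a.astype(np.float64)
    b = frame_b.astype(np.float64)
    return float(np.mean((a - b) ** 2))


def frame_changed(previous: Optional[np.ndarray], current: np.ndarray,
                  threshold: float = ERR_DIFF) -> bool:
    """Return True when the model should run on the current frame."""
    if previous is None:  # first frame: nothing to compare against
        return True
    return mse(previous, current) > threshold
```

In a capture loop, the detection call would sit behind `if frame_changed(previous, frame):`, with `previous` updated after each detection.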

Multi-processing can be used to get seamless detections without delay.
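One possible shape for this, sketched under assumptions rather than taken from the PR: a worker process runs the detection while the capture loop keeps going and skips frames whenever the worker is still busy. `placeholder_detect` is a hypothetical stand-in for the real detection call, and the integer "frames" stand in for webcam images.

```python
import multiprocessing as mp
import queue


def ocr_worker(frames, results, detect):
    """Consume frames until a None sentinel arrives; emit one result per frame."""
    while True:
        frame = frames.get()
        if frame is None:
            results.put(None)  # tell the consumer that no more results follow
            break
        results.put(detect(frame))


def placeholder_detect(frame):
    """Hypothetical stand-in for the real text detection call."""
    return f"text from frame {frame}"


def main():
    frames = mp.Queue(maxsize=1)  # at most one frame waiting for the worker
    results = mp.Queue()
    worker = mp.Process(target=ocr_worker,
                        args=(frames, results, placeholder_detect))
    worker.start()

    for frame in range(100):  # stands in for the webcam capture loop
        try:
            frames.put_nowait(frame)
        except queue.Full:
            pass  # worker still busy: drop this frame instead of waiting

    frames.put(None)  # blocking put, so the stop signal is never dropped
    for text in iter(results.get, None):  # read results until the sentinel
        print(text)
    worker.join()
```

`main()` should be called from under an `if __name__ == "__main__":` guard, which multiprocessing requires on platforms that start workers by spawning a fresh interpreter. Dropping frames while the worker is busy keeps the capture loop responsive at the cost of not running detection on every frame.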

Demo

The image is purely for demonstration purposes. The red text shows the detected text (the detected text is long, so it is cut off at the edge of the screen). The bounding boxes, the green text above them, and the confidence are for visualization purposes.

The higher the camera resolution, the better the results.

SAM-DEV007 · 2024-05-17T17:16:06Z

@TAHIR0110 Please review the pull request and let me know whether any changes are required.

SAM-DEV007 · 2024-05-22T14:44:59Z

@TAHIR0110 Please also add the labels to the PR, the same ones that are mentioned in the issue.

sudiptasarkar011 · 2024-05-23T06:44:47Z

I would like to work on this, can you please assign this to me?

TAHIR0110 · 2024-06-06T19:26:58Z

@SAM-DEV007 I have merged it and labelled it as level3 instead of level1.

SAM-DEV007 and others added 16 commits May 14, 2024 21:39

Create .gitignore

7520cc3

Create detect.py

5af2960

Create README.md

add7310

Create demo.py

b752310

Update detect.py

bb7765d

Added functions to detect text and process data from the image

Added models for text detection

7f64589

Update detect.py

54c5bf4

Removed verbose printing

Update demo.py

4a162b3

Added detection from webcam livestream

Update demo.py

662bff4

Added full detected text display

Update demo.py

608b6c8

Added mse function

Update detect.py

32a7b5e

Disabled reading as a paragraph

Update detect.py

d81f5ca

Fixed confidence bug

Update demo.py

c8e1dc2

Updated the file to fit the requirements of text detection

Create requirements.txt

4522528

Update README.md

5cf972a

Update README.md

1788fd8

SAM-DEV007 changed the title ~~OCR Detection~~ GSSoC'24: OCR Detection May 18, 2024

TAHIR0110 added level1 gssoc Associated with GSSOC labels May 22, 2024

TAHIR0110 assigned SAM-DEV007 Jun 6, 2024

TAHIR0110 added level2 and removed level1 labels Jun 6, 2024

TAHIR0110 added level3 and removed level2 labels Jun 6, 2024

TAHIR0110 merged commit 84663fb into TAHIR0110:main Jun 6, 2024
