Personal project: hand-made OCR machine learning model to recognize digits.

itslinotlie/digit-recognizer

Digit Recognizer

The Superior Pizza

At a quick glance, it is easy to identify this image as a pizza, at least for the cultured, sensible ones among us. We manage this by passing information between different areas of the brain with the help of more than 85 billion neurons. Image recognition is something we never think twice about, but how would a computer recognize images?

It is possible to hardcode what each picture represents with conditional statements, but what happens when there are modifications to the image? What if the image is tilted or is a different colour? At a certain point, there will be too many unknown variables to account for and conditional statements won't be viable anymore. Conditional statements work well when there aren't too many possibilities, such as in a Tic-Tac-Toe game, but in this case, something stronger than conditional statements is needed. Enter neural networks.

Our vast system of intertwined neurons is the inspiration for the widely used machine learning technique called Artificial Neural Networks (ANN), a technique that forms much of the foundation of computer vision.
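To make the neuron analogy a little more concrete, here is a minimal sketch of a single artificial neuron and a tiny two-layer forward pass in plain Python. It is an illustration only and not the code in this repository: the sigmoid activation, the function names, and every weight and input value are assumptions picked for the example.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum of the inputs plus a bias, then sigmoid."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)

def layer(inputs, weight_rows, biases):
    """A layer is several neurons that all read the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

# Hypothetical values, chosen only for illustration.
pixels = [0.0, 0.5, 1.0]                      # three made-up "pixel" intensities
hidden = layer(pixels, [[0.2, -0.4, 0.1],     # weights of hidden neuron 1
                        [0.7, 0.3, -0.5]],    # weights of hidden neuron 2
               [0.0, 0.1])                    # one bias per hidden neuron
output = layer(hidden, [[1.0, -1.0]], [0.0])  # a single output neuron
print(output)
```

A digit recognizer follows the same pattern at a larger scale: an MNIST image has 28 × 28 = 784 pixel inputs, and a typical design uses ten output neurons, one per digit.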


After learning the theory of backtracking and itching to code a small-scale project, I decided to build a Sudoku solver from scratch, completely underestimating the complexity of optical character recognition (OCR), the technology used to decipher values from images.

In this repository, I will document my journey through understanding neural networks, deriving the backpropagation algorithm, and training my model on the MNIST digit database.


The GUI

GUI Screen

Neural networks function much like a black box: the inputs and outputs are understood, but what happens in between is not. Training the network for days only to see an "accuracy" figure displayed at the end was far from helpful. Without a way to verify it, the number had no meaning. The solution? Create a GUI that allows digits to be drawn and recognized on demand, putting the network's supposed accuracy to the test. Bonus: it also visualizes what the MNIST digits look like, so when the predictions are off, you understand why (some of the digits are funky).
