This repository has been archived by the owner on Jun 14, 2018. It is now read-only.
Potential bug: output of Tesseract (C-API) and Tesseract (sh) is different #50
Comments
I observed similar problems when writing the tests. The outputs of libtesseract and tesseract (sh) are slightly different. Quite frankly, I'm not sure where this difference comes from. Currently, my main theory is that it comes from the way images are read:
I don't know if it's related to this issue, but one difference between tesseract and libtesseract is the default page segmentation mode (psm). The default psm in tesseract is '3', while in libtesseract it's '6'.
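One way to check whether the psm default explains the difference is to run the command-line tool with the mode set explicitly and compare the results. The following is only a minimal sketch under stated assumptions: the helper names `build_tesseract_cmd` and `ocr_with_psm` are hypothetical, a `tesseract` executable is assumed to be on the PATH, and the option is assumed to be spelled `--psm` (Tesseract 4 and later; 3.x releases used `-psm`).

```python
import subprocess

# Tesseract 4 defines page segmentation modes 0 through 13.
VALID_PSM = range(0, 14)


def build_tesseract_cmd(image_path, psm, legacy_flag=False):
    """Build a tesseract command line with an explicit page segmentation mode.

    Hypothetical helper for illustration; it is not part of any library.
    """
    if psm not in VALID_PSM:
        raise ValueError(f"unsupported psm: {psm}")
    # Assumption: Tesseract 4+ spells the option --psm; 3.x releases used -psm.
    flag = "-psm" if legacy_flag else "--psm"
    # Using "stdout" as the output base makes tesseract print the text
    # instead of writing it to a file.
    return ["tesseract", image_path, "stdout", flag, str(psm)]


def ocr_with_psm(image_path, psm, legacy_flag=False):
    """Run the tesseract executable and return the recognized text."""
    result = subprocess.run(
        build_tesseract_cmd(image_path, psm, legacy_flag),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Comparing `ocr_with_psm("text.png", 3)` with `ocr_with_psm("text.png", 6)` should show whether the layout mode alone accounts for the mismatch, or whether something else (such as image loading) is involved.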
Good point. It might explain the differences. I'll have a look later.
I got this simple example:
Where text.png is:
As this is a fairly simple case, I would have expected the outcome to be the same. However, the outcome is:
Is this a bug or are the two tools configured differently by default?
I know that Tesseract (C-API) works properly on my computer, as I have used it successfully with similar input. In this particular case, however, it fails.