partition_pdf: no text orientation detection? #2966

Closed
vivien000 opened this issue May 3, 2024 · 1 comment
Labels
bug (Something isn't working), pdf

Comments

@vivien000

Hi! I'm using partition_pdf on a PDF file that includes tables rotated 90° to the left (i.e., the top-left cell is at the bottom left of the page). The OCR step does not take this rotation into account, so the output text is incorrect. For example, the output text starts with "n o i t a t r o p s n a r T" ("Transportation" is the header of the rightmost column of the table).

Do I understand correctly that partition_pdf has no way to detect text orientation before performing OCR?

Thanks!

vivien000 added the bug label May 3, 2024
scanny added the pdf label May 3, 2024
@christinestraub
Contributor

Hi @vivien000, we don't officially support detecting text orientation before performing OCR at the moment, so rotated text is not handled, though it's something we could add support for in the future.
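
Until then, one possible workaround is to detect and correct the orientation yourself, outside of partition_pdf, before the page reaches OCR. The sketch below is not part of unstructured. It assumes each page has already been rendered to a Pillow image, that pytesseract and Tesseract's orientation and script detection (OSD) data are installed, and that the "Rotate" value reported by OSD is the clockwise correction in degrees (worth verifying on a sample page). The helper names `parse_osd_rotation` and `upright_page` are hypothetical.

```python
import re


def parse_osd_rotation(osd_text: str) -> int:
    """Return the 'Rotate' value (degrees) from Tesseract OSD text, or 0 if absent."""
    match = re.search(r"^Rotate:\s*(\d+)", osd_text, flags=re.MULTILINE)
    return int(match.group(1)) % 360 if match else 0


def upright_page(image):
    """Hypothetical helper: rotate a Pillow image upright based on Tesseract OSD.

    Assumption: OSD's 'Rotate' is a clockwise correction. Pillow's
    Image.rotate() turns counter-clockwise, hence the negated angle.
    """
    import pytesseract  # imported lazily so the parser above works without it

    osd_text = pytesseract.image_to_osd(image)  # plain-text OSD report
    rotate_cw = parse_osd_rotation(osd_text)
    if rotate_cw == 0:
        return image
    # expand=True grows the canvas so a 90° turn does not crop the page
    return image.rotate(-rotate_cw, expand=True)
```

The corrected images could then be saved and passed to an image partitioner or run through OCR directly. OSD tends to be unreliable on pages with very little text, so checking the reported orientation confidence before rotating may be worthwhile.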
