Can it recognize PDF files? #92

Closed
chengyuyu opened this issue Jan 17, 2024 · 3 comments

Comments

@chengyuyu

The original PaddleOCR can recognize PDF files. Could you add support for recognizing the image content of PDF files?

@hiroi-sora
Owner

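Umi-OCR is developing a PDF recognition feature. PDF parsing will be handled by Umi itself rather than by the PaddleOCR engine. There are currently no plans to update PaddleOCR-json.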

@yangyunlv

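Why does the recognition result I print from Umi-OCR contain two extra fields, 'from': 'text' and 'end': '', compared with the output here? Has the version here not been updated yet?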

@hiroi-sora
Owner

Has the version here not been updated yet?

PDF parsing is handled by a component inside Umi-OCR. PaddleOCR-json here is purely an OCR engine and has no PDF parsing capability.

"from" and "end" are also part of Umi's parsing output and have nothing to do with Paddle.
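
For anyone comparing the two outputs, here is a minimal sketch of dropping the Umi-added keys so that the remaining fields can be compared side by side. The helper name `strip_umi_fields` and the sample block are made up for illustration; only the "from" and "end" keys come from this thread, and the other fields in the sample are assumptions about what a text block might contain.

```python
# Keys that, according to the reply above, are added by Umi-OCR's parsing
# layer and are not produced by the PaddleOCR-json engine.
UMI_ONLY_KEYS = {"from", "end"}


def strip_umi_fields(blocks):
    """Return copies of the text-block dicts without the Umi-only keys.

    Hypothetical helper for illustration; it is not part of either project.
    """
    return [
        {key: value for key, value in block.items() if key not in UMI_ONLY_KEYS}
        for block in blocks
    ]


# Illustrative sample only; the "text" and "score" fields are assumed.
umi_blocks = [
    {"text": "Hello", "score": 0.98, "from": "text", "end": ""},
]

engine_like = strip_umi_fields(umi_blocks)
print(engine_like)
```

The original list is left unmodified, so the Umi-specific information is still available if it is needed later.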
