Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.
Updated Apr 14, 2024 - Python
Extract and visualize locations from any file.
A document search engine built with Apache NiFi for fast, comprehensive text indexing and search.
Apache Tika Server as Debian GNU/Linux and Ubuntu Linux package
Text extraction from scanned PDF documents in Java.
Configurable Tika Server Docker image. https://hub.docker.com/repository/docker/kujira/tika
Tesseract OCR wrapper for Apache Tika and/or Open Semantic ETL that caches OCR results, so Tika Server or Open Semantic ETL does not have to rerun slow and expensive OCR on the same images.
A PHP application for load testing PDF file processing, using docker-compose and Apache Tika.
A dockerized image of Apache Tika Server - https://tika.apache.org/
A document searcher for files on the local host, based on Tika with OCR, Elasticsearch, and Kibana.
Polymer 3.0 app for Apache Tika.
A Windows Installer (MSI) for the Windows service wrapper of the Tika JSR 311 network server.
Containerized (Docker) GeoTopicParser-enabled Apache Tika Server with Lucene Geo Gazetteer.
Web crawler with search indexing
A document search engine that combines modern technologies and architectures to handle complex data processing and retrieval tasks.
A Windows service wrapper for the Tika JSR 311 network server.
Generates word art and keywords from a document, for readers who do not want to read the whole text.
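Several of the projects listed above wrap a running Tika Server, and one of them caches OCR results so the same input is not processed twice. The following is a minimal sketch of that idea, not the code of any listed project. It assumes a Tika Server reachable on its default port 9998 and uses the server's `PUT /tika` endpoint with an `Accept: text/plain` header. The names `TIKA_URL`, `build_request`, `fetch_text`, and `CachedExtractor` are hypothetical and introduced only for illustration.

```python
import hashlib
import urllib.request

# Assumption: a Tika Server is running locally on its default port.
TIKA_URL = "http://localhost:9998/tika"


def build_request(data: bytes, url: str = TIKA_URL) -> urllib.request.Request:
    """Build a PUT request asking Tika Server for plain-text extraction."""
    return urllib.request.Request(
        url, data=data, method="PUT", headers={"Accept": "text/plain"}
    )


def fetch_text(data: bytes) -> str:
    """Send the document bytes to Tika Server and return the extracted text."""
    with urllib.request.urlopen(build_request(data)) as response:
        return response.read().decode("utf-8")


class CachedExtractor:
    """Hypothetical helper: caches results keyed by the SHA-256 of the input."""

    def __init__(self, fetch=fetch_text):
        self._fetch = fetch  # injectable, so the cache can be tested offline
        self._cache = {}

    def extract(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        if key not in self._cache:
            # Only the first request for a given input reaches the server.
            self._cache[key] = self._fetch(data)
        return self._cache[key]
```

Keying the cache on a content hash rather than a file name means identical images or documents stored under different paths are still processed only once. An in-memory dictionary is used here for brevity; a persistent store would be needed for the cache to survive restarts.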