
Python full-stack application that uses PyPDF2, Langchain, Firebase, Lottie, Faiss, Hugging Face embedding models, and Streamlit to support multi-PDF analysis through natural language processing, giving users an intuitive way to process PDFs and obtain content-related insights


Rale01/Bachelor-s-Thesis-SAAS-Document-Analysis-Application-for-PDF-Documents


BACHELOR'S THESIS: SAAS APPLICATION FOR ANALYZING DOCUMENTS IN PDF FORMAT📚🎓


PDF Inquisitor🗎🔎

PDF Inquisitor is a Python application for interacting with PDF documents. Beyond conventional PDF viewing, it lets users hold natural language conversations with multiple PDF documents at once. It uses language models to extract information from the textual content of the PDFs. Please note that how well the application responds depends on the questions it is asked.

Technologies used🧑‍💻

(Image: technologies used)

The main function of embedding models


PDF Inquisitor follows a structured workflow to produce accurate answers to user questions:

  • Loading PDFs: The application starts by reading multiple PDF documents, extracting their textual content, and preparing it for analysis.
  • Text segmentation: The extracted text is divided into smaller, more manageable segments so that it can be processed efficiently.
  • Language model: An embedding model generates vector representations (embeddings) of the text segments. These embeddings capture the semantic meaning of the text.
  • Similarity matching: When a user poses a question, the application compares it with the text segments and identifies the most semantically similar ones.
  • Answer generation: The selected text segments are then passed to a language model, which generates a coherent answer based on the relevant content of the PDFs. This answer is displayed to the user.
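
The segmentation, embedding, and similarity-matching steps above can be illustrated with a minimal sketch. It is not the application's code: the function names (split_text, embed, cosine, top_segments), the chunk sizes, and the sample sentences are all hypothetical, and a simple word-count vector stands in for the Hugging Face embedding model and the Faiss index, so that the example runs with the Python standard library alone.

```python
import math
import re
from collections import Counter


def split_text(text, chunk_size=200, overlap=50):
    """Text segmentation: cut the text into overlapping character windows."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def embed(text):
    """Toy embedding (assumption): a word-count vector, not a real model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_segments(question, segments, k=2):
    """Similarity matching: return the k segments closest to the question."""
    q = embed(question)
    ranked = sorted(segments, key=lambda s: cosine(q, embed(s)), reverse=True)
    return ranked[:k]


# Hypothetical segments, as if already extracted from several PDFs.
segments = [
    "Invoices must be paid within 30 days of receipt.",
    "The warranty covers hardware defects for two years.",
    "Support is available by email on weekdays.",
]
print(top_segments("How long does the warranty last?", segments, k=1))
```

In the workflow described above, the segments returned by the matching step would then be passed, together with the question, to a language model for answer generation. That step is omitted here because it depends on the model being used.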
