MTech Project in collaboration with WNS Global Service
This project has two goals: first, to build a reliable, open-source system that efficiently extracts information from private PDF files; and second, to develop a user-friendly web application that uses web scraping to gather relevant data from online sources.
- Internal document RAG system: Extracting information from private PDF files is often complex and inefficient for organizations. This limits their ability to use internal knowledge, resulting in missed opportunities for better customer interactions and a competitive advantage in the market. The project addresses this with a robust solution for extracting data from PDFs, so that organizations can draw valuable insights from their private PDF documents.
- Web search RAG system: Finding relevant, accurate information on the internet in a timely manner is difficult. Although search engines have become essential, many users struggle to refine their queries to get more accurate results. In addition, current search engines often fail to provide complete context or supplementary information that would improve the user's understanding of the retrieved data. The project addresses this with a system that enhances web search retrieval by offering contextually relevant information and generating additional insights based on user queries.
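
Both systems above share the same retrieve-then-generate core: split source text into chunks, pick the chunks most relevant to the query, and pass them to a language model as context. The sketch below illustrates that core only, and is not the project's actual implementation. It assumes the text has already been extracted from the PDFs or scraped pages, and it uses a simple term-frequency cosine score as a stand-in for an embedding model. All function names (`chunk_text`, `retrieve`, `build_prompt`) and the chunk sizes are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def chunk_text(text, size=40, overlap=10):
    """Split text into overlapping word windows (sizes are illustrative; size > overlap)."""
    words = text.split()
    if not words:
        return []
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query (assumed stand-in for vector search)."""
    q = Counter(tokenize(query))
    ranked = sorted(chunks, key=lambda c: cosine(q, Counter(tokenize(c))), reverse=True)
    return ranked[:k]


def build_prompt(query, context_chunks):
    """Combine the retrieved context and the question into one prompt string."""
    context = "\n---\n".join(context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )
```

In use, the output of `chunk_text` for each document (or scraped page) would be pooled into one list, `retrieve` would select the top chunks for a user query, and the string from `build_prompt` would be sent to whichever language model the system uses. Swapping the term-frequency score for embeddings changes only `retrieve`.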
Figures: Custom Architecture of RAG System; High-level technical flow