
LLM Prototypes

The current prototype is a retrieval-based conversational chatbot. We aim to create a chatbot for internal use that helps government workers answer public inquiries from Boston residents. To achieve this, we customized the knowledge base of OpenAI's LLM with hierarchical, multi-format data from Boston.gov, using LlamaIndex within LangChain. The current iteration of the model gives natural-language responses based on relevant source files retrieved from its knowledge base, and it produces quantified scores that measure the relevance of the retrieved source files and the accuracy of its response.

This repo contains the client- and server-side code. The web app provides real-time chatting with the chatbot as well as a frontend for uploading and viewing files in the model's current knowledge base. The frontend is a React app. The backend uses Flask and makes significant use of tools like LlamaIndex, mainly for data processing and for connecting to the vector store. The data store is composed of Azure Blob Storage (stores files and enables quick lookup/download), Azure Cognitive Search (stores files and their metadata in vector form to enable integration with the LLM), and Airtable (stores NoSQL data such as user feedback).
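Below is a minimal sketch of the query path described above: a Flask route that runs a question through a LlamaIndex query engine and returns the answer along with the retrieved sources. The route name, file paths, and exact LlamaIndex calls are assumptions (APIs vary across versions), not the repo's actual code.

```python
# Minimal sketch of the query path (assumptions, not the repo's exact code).
from flask import Flask, request, jsonify
from llama_index import VectorStoreIndex, SimpleDirectoryReader

app = Flask(__name__)

# Build a vector index over the knowledge-base files once at startup.
documents = SimpleDirectoryReader("./knowledge_base").load_data()  # hypothetical path
query_engine = VectorStoreIndex.from_documents(documents).as_query_engine()

@app.route("/query", methods=["POST"])
def query():
    question = request.json["question"]
    response = query_engine.query(question)
    # Return the answer along with the retrieved source snippets and scores.
    sources = [
        {"text": n.node.get_text(), "score": n.score}
        for n in response.source_nodes
    ]
    return jsonify({"answer": str(response), "sources": sources})
```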

For a more detailed introduction and instructions on installing the app locally, please see the GitHub wiki.

Week 1 Progress:

  • On top of the current implementation in Flask, make sure the model runs with multiple file inputs, and potentially with file inputs of different formats (see the sketch after this list)
  • Build the client side of the application in React, including real-time chatbot responses and moving dots indicating that a response is being generated
  • Explore the use of roles and contexts
    • By helping users create structured queries, the responses may become more accurate
    • Users are not allowed to become GPT free-riders by telling the bot to “forget about everything and answer this”
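A minimal sketch of the multi-file, multi-format input from the first item above, assuming LlamaIndex's SimpleDirectoryReader is the loader; the file paths and sample query are hypothetical.

```python
# Minimal sketch: index several files of different formats.
# SimpleDirectoryReader picks a parser per file extension (.pdf, .docx, .txt, ...).
from llama_index import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader(
    input_files=["data/faq.pdf", "data/permits.docx", "data/contacts.txt"]  # hypothetical paths
).load_data()

index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
print(query_engine.query("How do I apply for a residential parking permit?"))
```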

Week 2 Progress:

  • Implement a user feedback system (thumbs up/down) on the client side; feedback data, including the current question and chatbot response, are instantly sent to Airtable for analysis
  • Read the documentation for LangChain and migrate the LlamaIndex query engine to a LangChain agent framework; currently the agent's toolkit only contains LlamaIndex (see the sketch after this list)
  • Build frontend file upload portal
  • Receive uploaded file on the backend
  • Improve styling of components using Bootstrap or CSS
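The migration described above can be sketched as wrapping the LlamaIndex query engine as the single tool of a LangChain agent. Class names follow the langchain 0.0.x / llama_index 0.x APIs and may differ in newer releases; the tool name and sample query are placeholders.

```python
# Minimal sketch: expose a LlamaIndex query engine as a LangChain agent tool.
from langchain.agents import AgentType, Tool, initialize_agent
from langchain.chat_models import ChatOpenAI
from llama_index import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("./knowledge_base").load_data()
query_engine = VectorStoreIndex.from_documents(documents).as_query_engine()

tools = [
    Tool(
        name="boston_knowledge_base",
        func=lambda q: str(query_engine.query(q)),
        description="Answers questions using documents from Boston.gov.",
    )
]

agent = initialize_agent(
    tools,
    ChatOpenAI(temperature=0),
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
)
print(agent.run("What are the hours for Boston City Hall?"))
```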

Week 3 Progress:

  • Fix the file upload problem by adding a storage context so files persist across uploads and are dynamically integrated into LlamaIndex's indices without the need to restart the server (see the sketch after this list)
  • Add two more tools to the toolkit: a SerpAPI web search tool and a ChatGPT plugin
  • Integrate Pinecone vector store
  • Refactor code to reflect a clearer code structure; add comments and improve documentation
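A minimal sketch of the storage-context and Pinecone items above, assuming the pinecone-client 2.x and llama_index 0.x APIs; index names, keys, and the inserted document are placeholders.

```python
# Minimal sketch: Pinecone-backed storage context with dynamic insertion.
import pinecone
from llama_index import Document, SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.vector_stores import PineconeVectorStore

pinecone.init(api_key="...", environment="...")  # placeholders
vector_store = PineconeVectorStore(pinecone_index=pinecone.Index("llm-prototypes"))
storage_context = StorageContext.from_defaults(vector_store=vector_store)

documents = SimpleDirectoryReader("./knowledge_base").load_data()
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

# New uploads can be inserted into the live index without restarting the server.
index.insert(Document(text="Trash pickup in Beacon Hill is on Mondays."))
```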

Week 4 Progress:

  • Return the agent's full thought process on the frontend
  • Explore the potential for text retrieval: return the original source text to provide more transparency
  • Add metadata about each file for more accurate and efficient indexing (see the sketch after this list)
  • Expand current data sources: web scraping
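A minimal sketch of the metadata and source-text items above, assuming a llama_index 0.8-era Document API; the field names and sample content are illustrative only.

```python
# Minimal sketch: attach per-file metadata and surface original source text.
from llama_index import Document, VectorStoreIndex

doc = Document(
    text="Residential parking permits are issued by the Parking Clerk...",  # illustrative
    metadata={"source": "boston.gov/parking", "category": "transportation"},
)
index = VectorStoreIndex.from_documents([doc])
response = index.as_query_engine().query("Who issues parking permits?")

# Each retrieved node carries its original text and metadata for transparency.
for node_with_score in response.source_nodes:
    print(node_with_score.node.metadata, node_with_score.node.get_text())
```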

Week 5 Progress:

  • Rewire the system to use a single language model instead of an agent with multiple tools, skipping the thought process
  • Implement a user feedback popup window and connect it to Airtable (see the sketch after this list)
  • Create Doc Store table in Airtable to store all source files
  • Add a route to get all current files
  • Start on side project: browser extension that queries current web page content
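A minimal sketch of the feedback route writing to Airtable, assuming the pyairtable client; the base ID, table name, field names, and route are hypothetical.

```python
# Minimal sketch: send thumbs up/down feedback to Airtable from a Flask route.
import os
from flask import Flask, jsonify, request
from pyairtable import Table

app = Flask(__name__)
feedback_table = Table(os.environ["AIRTABLE_API_KEY"], "appXXXXXXXX", "Feedback")  # placeholders

@app.route("/feedback", methods=["POST"])
def feedback():
    data = request.json
    feedback_table.create({
        "Question": data["question"],
        "Response": data["response"],
        "ThumbsUp": data["thumbs_up"],
    })
    return jsonify({"status": "ok"})
```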

Week 6 & 7 Progress:

  • Implement frontend file card to show source files and metadata
  • Support URL upload: load data from a URL, break it down into nodes, and send them to the backend (see the sketch after this list)
  • Change file storage from Pinecone to Azure Cognitive Search
  • Add frontend to show all files under each category
  • Work on side project while waiting for Azure access
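A minimal sketch of the URL-ingestion step above, assuming LlamaIndex's SimpleWebPageReader loader and SimpleNodeParser; the loader choice, chunk size, and URL are assumptions, and API details vary by llama_index version.

```python
# Minimal sketch: load a web page and split it into fixed-size nodes for indexing.
from llama_index import download_loader
from llama_index.node_parser import SimpleNodeParser

SimpleWebPageReader = download_loader("SimpleWebPageReader")
documents = SimpleWebPageReader(html_to_text=True).load_data(
    ["https://www.boston.gov/departments/311"]  # example URL
)

parser = SimpleNodeParser.from_defaults(chunk_size=512)
nodes = parser.get_nodes_from_documents(documents)
print(f"{len(nodes)} nodes ready to be indexed")
```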

Week 8 Progress:

  • Figure out a way to circumvent Azure's paywall by generating responses based on retrieved files only
  • Rewrite the entire backend script for a clearer code organization
  • Add routes for checking whether an index exists in Cognitive Search; if it doesn't, create a new index (see the sketch after this list)
  • Streamline the file processing workflow: store to blob, break down to nodes, store to search index
  • Simplify blob storage by using a single container
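A minimal sketch of the index check-and-create step above, using the azure-search-documents SDK; the service endpoint, key, index name, and field schema are placeholders, not the project's actual schema.

```python
# Minimal sketch: create the Cognitive Search index only if it does not exist.
from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import (
    SearchableField, SearchFieldDataType, SearchIndex, SimpleField,
)

client = SearchIndexClient(
    "https://<service>.search.windows.net", AzureKeyCredential("<key>")  # placeholders
)

INDEX_NAME = "llm-prototypes"
if INDEX_NAME not in client.list_index_names():
    index = SearchIndex(
        name=INDEX_NAME,
        fields=[
            SimpleField(name="id", type=SearchFieldDataType.String, key=True),
            SearchableField(name="content"),
            SimpleField(name="category", type=SearchFieldDataType.String, filterable=True),
        ],
    )
    client.create_index(index)
```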

Week 9 Progress:

  • Get confidence and relevance scores to quantify the accuracy of LLM responses (see the sketch after this list)
  • Break documents into fixed-length nodes to avoid a loader bug with large files
  • Add metadata fields to Cognitive Search so metadata can be taken into account when retrieving relevant files
  • Prettify the frontend, redesign the sidebar
  • Move the responsibility of filtering files by category from Blob Storage to Cognitive Search, since blobs are no longer organized by category
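A minimal sketch of surfacing relevance scores from retrieval, assuming a llama_index 0.x query engine; averaging the source scores as a rough "confidence" value is an illustrative heuristic, not necessarily the project's exact formula.

```python
# Minimal sketch: expose per-source relevance scores and a simple confidence value.
from llama_index import SimpleDirectoryReader, VectorStoreIndex

query_engine = VectorStoreIndex.from_documents(
    SimpleDirectoryReader("./knowledge_base").load_data()
).as_query_engine(similarity_top_k=3)

response = query_engine.query("When is street cleaning in Dorchester?")
scores = [n.score for n in response.source_nodes if n.score is not None]
confidence = sum(scores) / len(scores) if scores else 0.0  # illustrative heuristic

print("answer:", str(response))
print("relevance scores:", scores)
print("confidence:", round(confidence, 3))
```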

Week 10 Progress:

  • Fix file upload window resizing bug
  • Prune unnecessary dependencies
  • Add customizable options and refactor frontend files
  • Finish wiki
  • Finalize report

About

Experimental projects with LLMs, such as a LangChain agent chatbot; developed during Google Summer of Code.
