Generative Question and Answer Large Language Model

Project Overview

DataSpeak, one of the industry's largest providers of predictive analytics solutions, needed a proof-of-concept machine learning model that automatically generates answers to user-submitted questions.

Machine Learning Skills/Technologies

Text2TextGeneration, Transformers, Tokenizers, PyTorch, Hugging Face, Flan-T5 LLM, spaCy, Streamlit, Render, GPU, BeautifulSoup, Google Colab

Project Conclusions

Fine-tuned google/flan-t5-base on Stack Overflow data to produce a generative question-answering model.
Built a database of vector embeddings for the dataset's questions and used cosine similarity to retrieve the 5 questions most similar to each user question.
Developed a web application with a chatbot UI that returns the model's generated answer plus 5 alternative answers retrieved by cosine similarity, each shown with a percent similarity score.
Improved training set quality by pre-processing and normalizing the raw text data.
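
The top-5 retrieval step described above can be illustrated with a minimal sketch. It is not the project's actual code: the function names, the toy 3-dimensional vectors, and the pure-Python math are assumptions made for illustration. Real embeddings would come from an embedding model, and the project more likely relies on a library routine for the similarity computation.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def top_k_similar(query_vec, question_vecs, k=5):
    """Return (index, percent_similarity) pairs for the k closest stored questions."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(question_vecs)]
    # Highest similarity first; ties keep their original order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [(i, round(100.0 * score, 1)) for i, score in scored[:k]]


# Hypothetical toy "embeddings" standing in for the stored question vectors.
stored = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.5, 0.5, 0.0],
    [0.6, 0.0, 0.8],
]
query = [1.0, 0.0, 0.0]

results = top_k_similar(query, stored, k=5)
for index, percent in results:
    print(f"question {index}: {percent}% similar")
```

Each returned index would map back to a stored question and its answer, which is how the alternative answers and their percent scores could be produced for the UI.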

Screenshot of Web Application UI

Performance & Evaluation

Achieved a 19% ROUGE-1 score and an average perplexity of 1.96.
Kept response times under 15 seconds.
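
For readers unfamiliar with the two metrics, the sketch below shows one common way to compute them. It rests on several assumptions: the README does not say which ROUGE-1 variant was reported (F1 is used here), whitespace tokenization stands in for a real tokenizer, and perplexity is taken as the exponential of the mean negative log-likelihood per token. The function names are hypothetical, and the project may have used an evaluation library instead.

```python
import math
from collections import Counter


def rouge1_f1(reference, candidate):
    """Unigram-overlap F1 between a reference answer and a generated answer."""
    ref_tokens = reference.lower().split()
    cand_tokens = candidate.lower().split()
    if not ref_tokens or not cand_tokens:
        return 0.0
    # Clipped overlap: each word counts at most as often as it appears in both texts.
    overlap = sum((Counter(ref_tokens) & Counter(cand_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def perplexity(token_log_probs):
    """exp of the mean negative natural-log probability per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# Illustrative inputs only; these are not results from the project.
score = rouge1_f1(
    "use a list comprehension to filter the list",
    "use a list comprehension",
)
ppl = perplexity([math.log(0.5)] * 4)
print(f"ROUGE-1 F1: {score:.3f}, perplexity: {ppl:.2f}")
```

A perplexity close to 1 means the model assigns high probability to each reference token, while ROUGE-1 measures only word overlap, so the two numbers capture different aspects of answer quality.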

Requirements

Python libraries: pandas, numpy, matplotlib, seaborn, scikit-learn, nltk, transformers, spacy, torch

Data Description

Python Questions from Stack Overflow

laceymalarky/nlp_question_answer
