
Specialized-Immigration-Assistant

Project Description

The Specialized Immigration Assistant is an application built on Llama 2, a large language model (LLM), to provide specialized assistance with questions about US immigration law. The project aims to make legal information more accessible and easier to understand for users seeking guidance on US immigration matters.


Key Features:

Training with Legal Articles: We fine-tuned the Llama 2 large language model on a collection of legal articles sourced from a law group.

Customized Immigration Assistant Model: We adapted Llama 2 into a specialized Immigration Assistant model that understands immigration-related queries and responds in appropriate legal language.

Improved Accuracy: Fine-tuning on a dataset rich in legal terminology and nuance improved the accuracy and context-specificity of the model's responses.

Real-time Immigration Information (in progress): We are working on an AI interface that gives users access to reliable, up-to-date legal information, to support better-informed decisions in US immigration matters.


Built With

  • Python
  • NumPy (matrix math operations)
  • PyTorch (building deep learning models)
  • Datasets (accessing datasets from the Hugging Face Hub)
  • huggingface_hub (accessing Hugging Face data and models)
  • Transformers (accessing models from the Hugging Face Hub)
  • TRL (Transformer Reinforcement Learning; also used for fine-tuning)
  • bitsandbytes (quantization, which reduces a model's memory footprint)
  • SentencePiece (tokenization, including byte-pair encoding)
  • OpenAI (creating synthetic fine-tuning and reward-model data)
  • PEFT (Parameter-Efficient Fine-Tuning; uses low-rank adaptation (LoRA) to fine-tune)
  • Jupyter Notebook
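
As a rough illustration of why the quantization and LoRA entries above matter, the sketch below estimates weight memory at different bit widths and counts the trainable parameters LoRA adds for a single weight matrix. The 7-billion-parameter model size, the 4096 x 4096 layer shape, and the rank of 8 are assumptions chosen for illustration; this README does not state which values the project uses.

```python
def weight_memory_gb(n_params: int, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in gigabytes (1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in the two low-rank factors LoRA trains for one matrix.

    LoRA freezes the original d_out x d_in weight and learns an update
    B @ A, where A is rank x d_in and B is d_out x rank.
    """
    return rank * (d_in + d_out)


if __name__ == "__main__":
    n_params = 7_000_000_000  # assumed model size, not stated in this README
    print(f"16-bit weights: {weight_memory_gb(n_params, 16):.1f} GB")
    print(f" 4-bit weights: {weight_memory_gb(n_params, 4):.1f} GB")

    d = 4096  # assumed hidden size for one square projection matrix
    full = d * d
    lora = lora_trainable_params(d, d, rank=8)  # assumed LoRA rank
    print(f"full matrix: {full:,} params, LoRA factors: {lora:,} params")
```

These figures cover weights only; activations, optimizer state, and framework overhead add to the real memory requirement.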


Data Sources

  • The dataset was provided by a private law firm and is not included in the GitHub repository for data privacy reasons.


Getting Started

Installation

  1. Install all dependencies using pip
 pip install numpy pandas torch datasets huggingface_hub transformers trl bitsandbytes sentencepiece openai peft evaluate rouge_score
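
To confirm the installation worked before opening the notebook, a small check along these lines may help. It is only a sketch: the `missing_packages` helper is a hypothetical name, not part of this repository, and it tests only that each package can be located, not that its version is compatible.

```python
import importlib.util

# Import names for the packages listed in the pip command above.
REQUIRED = [
    "numpy", "pandas", "torch", "datasets", "huggingface_hub",
    "transformers", "trl", "bitsandbytes", "sentencepiece",
    "openai", "peft", "evaluate", "rouge_score",
]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All dependencies found.")
```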


Training

To train the Immigration Assistant model, run the llam2_training.ipynb notebook either locally or on a cloud service such as Google Colab Pro. A GPU is needed for practical training times.

If you don't have access to a GPU, Google Colab Pro is a convenient, low-cost alternative at $10 per month.
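
One way to check whether the current machine has a usable GPU is to ask PyTorch directly. The snippet below is a minimal sketch; `detect_device` is a hypothetical helper name, and it checks only for CUDA GPUs through `torch.cuda.is_available()`.

```python
def detect_device() -> str:
    """Return 'cuda' if PyTorch sees a CUDA GPU, 'cpu' if it does not,
    or 'torch-missing' if PyTorch is not installed."""
    try:
        import torch
    except ImportError:
        return "torch-missing"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    device = detect_device()
    print("Detected device:", device)
    if device != "cuda":
        print("No CUDA GPU detected; consider a cloud GPU runtime.")
```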

For more detail on the training process, see the llam2_training.ipynb notebook, which walks through the training methodology and how the model is fine-tuned for immigration-related queries.

Cloud Training

Open the training notebook on GitHub: https://github.com/kedir/Specialized-Immigration-Assistant-LLM-Chatbot/blob/main/notebooks/llam2_training.ipynb/

Local Training

  git clone https://github.com/kedir/Specialized-Immigration-Assistant-LLM-Chatbot.git
  cd Specialized-Immigration-Assistant-LLM-Chatbot/notebooks
  jupyter notebook llam2_training.ipynb


Support

Contributions, issues, and feature requests are welcome!

Give a ⭐️ if you like this project!


License

Distributed under the MIT License. See LICENSE.txt for more information.


Contact


Acknowledgments

  • Meta
  • Hugging Face
  • OpenAI
