
Support llama-3 #789

Open
boixu opened this issue Apr 22, 2024 · 10 comments

@boixu

boixu commented Apr 22, 2024

Hi,

Please add support for llama-3.

Currently the prompt template is not compatible, since llama-3 uses a different style.
Ref: https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3
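For reference, the two styles differ roughly as follows (a sketch based on the linked model card, not this project's exact templates):

# Llama-2 chat style: [INST] / <<SYS>> wrappers around a single turn.
llama2_prompt = "[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{question} [/INST]"

# Llama-3 instruct style: header-delimited turns with new special tokens.
llama3_prompt = (
    "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n"
    "{system_prompt}<|eot_id|>"
    "<|start_header_id|>user<|end_header_id|>\n"
    "{question}<|eot_id|>"
    "<|start_header_id|>assistant<|end_header_id|>"
)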

As it stands, I was unable to use the llama-3 model.

Thanks in advance!

@toomy0toons

Hi, I tried llama-3 and maybe you can use my setup. The code is a little dirty.

First, add a template for llama3 in prompt_template_utils.py:



# prompt_template_utils.py
# (assumes PromptTemplate is already imported at the top of this file:
#  from langchain.prompts import PromptTemplate)
def get_prompt_template(system_prompt=system_prompt, promptTemplate_type=None, history=False):
    if promptTemplate_type == "llama3":
        if history:
            prompt = PromptTemplate(
                template="""<|begin_of_text|><|start_header_id|>system<|end_header_id|> You are a helpful assistant, you will use the provided context to answer user questions.
Read the given context before answering questions and think step by step. If you can not answer a user question based on
the provided context, inform the user. Do not use any other information for answering the user. Provide a detailed answer to the question. <|eot_id|><|start_header_id|>user<|end_header_id|>
                Context: {history} \n {context}
                User: {question}
                Answer: <|eot_id|><|start_header_id|>assistant<|end_header_id|>""",
                input_variables=["history", "context", "question"],
            )
        else:
            prompt = PromptTemplate(
                template="""<|begin_of_text|><|start_header_id|>system<|end_header_id|> You are a helpful assistant, you will use the provided context to answer user questions.
Read the given context before answering questions and think step by step. If you can not answer a user question based on
the provided context, inform the user. Do not use any other information for answering the user. Provide a detailed answer to the question. <|eot_id|><|start_header_id|>user<|end_header_id|>
                Context: {context}
                User: {question}
                Answer: <|eot_id|><|start_header_id|>assistant<|end_header_id|>""",
                input_variables=["context", "question"],
            )
    elif promptTemplate_type == "llama":
        B_INST, E_INST = "[INST]", "[/INST]"
        B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"
        SYSTEM_PROMPT = B_SYS + system_prompt + E_SYS
        # ... rest of the existing llama / mistral / non_llama branches unchanged

Then add an option for choosing llama3 in run_localGPT.py:

@click.option(
    "--model_type",
    default="llama",
    type=click.Choice(
        ["llama", "mistral", "non_llama", "llama3"],
    ),
    help="Model type: llama, mistral, non_llama, or llama3",
)
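That choice only takes effect where run_localGPT.py forwards model_type into get_prompt_template; a minimal sketch of that call (the surrounding variable names are illustrative, not necessarily the file's exact ones):

# Illustrative only -- the real call site in run_localGPT.py may differ slightly.
prompt = get_prompt_template(promptTemplate_type=model_type, history=use_history)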

You can now run it with python run_localGPT.py --model_type llama3.

Here is the model I used for testing, in constants.py:

# LLAMA 3
MODEL_ID = "unsloth/llama-3-8b-bnb-4bit"
MODEL_BASENAME = None
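Since MODEL_BASENAME is None here, this model is loaded through transformers rather than llama-cpp. A minimal sketch of what that load amounts to (not localGPT's exact loading code; assumes bitsandbytes is installed for the pre-quantized 4-bit weights):

# Sketch: load the same model directly with transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "unsloth/llama-3-8b-bnb-4bit"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" places the 4-bit weights on the available GPU.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")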

@KerenK-EXRM
Contributor

@toomy0toons did you upgrade the llama cpp or transformers version to make this work with llama-3?

@toomy0toons

toomy0toons commented May 3, 2024

I installed llama-cpp following the README docs.

I have a CUDA GPU, so I installed the cuBLAS version:

# Example: cuBLAS
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.83 --no-cache-dir

I did not install or upgrade anything beyond the official instructions; it works out of the box. But since requirements.txt does not pin a version and I installed yesterday, my versions may be more recent than yours. My transformers is transformers==4.38.2 now.

@KerenK-EXRM, is there a problem running llama3?

@PromtEngineer
Owner

I think that since llama2 is probably not going to be used anymore, I will make the llama3 prompt template the default.
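Presumably that just means flipping the click default shown above (a guess at the eventual change, not a commit from the repo):

@click.option(
    "--model_type",
    default="llama3",  # hypothetical: llama3 as the new default
    type=click.Choice(
        ["llama", "mistral", "non_llama", "llama3"],
    ),
    help="Model type: llama, mistral, non_llama, or llama3",
)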

@KerenK-EXRM
Contributor

@toomy0toons I tried with another version (QuantFactory/Meta-Llama-3-8B-GGUF) and it didn't work.
Looks like the project has been adjusted to support llama3.
Thank you! Can't wait to try :)

@VISWANATH78

Hi, I have downloaded the llama3 70B model. Can someone provide the steps to convert it into Hugging Face format and then run it in localGPT? I did the same for llama 70B and was able to run it, but I am not able to convert the full llama3 model files to .hf format, so proper steps for this would be appreciated. Thank you.
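For what it's worth, transformers ships a conversion script for Meta-format Llama checkpoints; a hedged sketch of its invocation (flag names, --llama_version in particular, depend on your installed transformers version, so verify with --help first):

# Sketch: convert Meta-format llama-3 weights into a Hugging Face style directory.
# Paths are placeholders; check the exact flags against your transformers version.
python src/transformers/models/llama/convert_llama_weights_to_hf.py \
    --input_dir /path/to/Meta-Llama-3-70B \
    --model_size 70B \
    --llama_version 3 \
    --output_dir /path/to/llama-3-70b-hf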

@carloposo


Hi @toomy0toons, I'm trying to do the same but am having some issues, as per #793.

@toomy0toons

@carloposo
@KerenK-EXRM

My understanding is that the instruct model (8b) has an extra set of tokens or a different prompt template.

Try the 7b models?

@carloposo


There are no 7B models for llama3 (https://adithyask.medium.com/from-7b-to-8b-parameters-understanding-weight-matrix-changes-in-llama-transformer-models-31ea7ed5fd88).

Do you mean none of the embedding models in constants.py are OK for running any of the llama-3 8B models?
