
Support llama-3 #789

Open
boixu opened this issue Apr 22, 2024 · 10 comments

@boixu

boixu commented Apr 22, 2024

Hi,

Please add support for llama-3.

Currently the prompt template is not compatible, since llama-3 uses a different style.
Ref: https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3
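For reference, the two styles differ roughly as follows (a sketch based on the linked model card, not this project's exact templates):

# Llama-2 chat style: [INST] / <<SYS>> wrappers around a single turn.
llama2_prompt = "[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{question} [/INST]"

# Llama-3 instruct style: header-delimited turns with new special tokens.
llama3_prompt = (
    "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n"
    "{system_prompt}<|eot_id|>"
    "<|start_header_id|>user<|end_header_id|>\n"
    "{question}<|eot_id|>"
    "<|start_header_id|>assistant<|end_header_id|>"
)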

As it stands, I was unable to use the llama-3 model.

Thanks in advance!

@toomy0toons

Hi, I tried llama-3 and maybe you can use my setup. The code is a little dirty.

First, add a template for llama3 in prompt_template_utils.py:



# prompt_template_utils.py
# (assumes PromptTemplate is already imported at the top of this file:
#  from langchain.prompts import PromptTemplate)
def get_prompt_template(system_prompt=system_prompt, promptTemplate_type=None, history=False):
    if promptTemplate_type == "llama3":
        if history:
            prompt = PromptTemplate(
                template="""<|begin_of_text|><|start_header_id|>system<|end_header_id|> You are a helpful assistant, you will use the provided context to answer user questions.
Read the given context before answering questions and think step by step. If you can not answer a user question based on
the provided context, inform the user. Do not use any other information for answering the user. Provide a detailed answer to the question. <|eot_id|><|start_header_id|>user<|end_header_id|>
                Context: {history} \n {context}
                User: {question}
                Answer: <|eot_id|><|start_header_id|>assistant<|end_header_id|>""",
                input_variables=["history", "context", "question"],
            )
        else:
            prompt = PromptTemplate(
                template="""<|begin_of_text|><|start_header_id|>system<|end_header_id|> You are a helpful assistant, you will use the provided context to answer user questions.
Read the given context before answering questions and think step by step. If you can not answer a user question based on
the provided context, inform the user. Do not use any other information for answering the user. Provide a detailed answer to the question. <|eot_id|><|start_header_id|>user<|end_header_id|>
                Context: {context}
                User: {question}
                Answer: <|eot_id|><|start_header_id|>assistant<|end_header_id|>""",
                input_variables=["context", "question"],
            )
    elif promptTemplate_type == "llama":
        B_INST, E_INST = "[INST]", "[/INST]"
        B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"
        SYSTEM_PROMPT = B_SYS + system_prompt + E_SYS
        # ... rest of the existing llama / mistral / non_llama branches unchanged

Then add an option for choosing llama3 in run_localGPT.py:

@click.option(
    "--model_type",
    default="llama",
    type=click.Choice(
        ["llama", "mistral", "non_llama", "llama3"],
    ),
    help="Model type: llama, mistral, non_llama, or llama3",
)
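That choice only takes effect where run_localGPT.py forwards model_type into get_prompt_template; a minimal sketch of that call (the surrounding variable names are illustrative, not necessarily the file's exact ones):

# Illustrative only -- the real call site in run_localGPT.py may differ slightly.
prompt = get_prompt_template(promptTemplate_type=model_type, history=use_history)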

You can now run it with python run_localGPT.py --model_type llama3.

Here is the model I used for testing, in constants.py:

# LLAMA 3
MODEL_ID = "unsloth/llama-3-8b-bnb-4bit"
MODEL_BASENAME = None
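Since MODEL_BASENAME is None here, this model is loaded through transformers rather than llama-cpp. A minimal sketch of what that load amounts to (not localGPT's exact loading code; assumes bitsandbytes is installed for the pre-quantized 4-bit weights):

# Sketch: load the same model directly with transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "unsloth/llama-3-8b-bnb-4bit"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" places the 4-bit weights on the available GPU.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")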

@KerenK-EXRM
Contributor

@toomy0toons did you upgrade the llama cpp or transformers version to make this work with llama-3?

@toomy0toons

toomy0toons commented May 3, 2024

I installed llama-cpp following the README docs.

I have a CUDA GPU, so I installed the cuBLAS version:

# Example: cuBLAS
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.83 --no-cache-dir

I did not install or upgrade anything beyond the official instructions; it works out of the box. But since requirements.txt does not pin a version and I installed yesterday, my versions may be more recent than yours. My transformers is transformers==4.38.2 now.

@KerenK-EXRM, is there a problem running llama3?

@PromtEngineer
Owner

I think that since llama2 is probably not going to be used anymore, I will make the llama3 prompt template the default.
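Presumably that just means flipping the click default shown above (a guess at the eventual change, not a commit from the repo):

@click.option(
    "--model_type",
    default="llama3",  # hypothetical: llama3 as the new default
    type=click.Choice(
        ["llama", "mistral", "non_llama", "llama3"],
    ),
    help="Model type: llama, mistral, non_llama, or llama3",
)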

@KerenK-EXRM
Contributor

@toomy0toons I tried with another version (QuantFactory/Meta-Llama-3-8B-GGUF) and it didn't work.
Looks like the project has been adjusted to support llama3.
Thank you! Can't wait to try :)

@VISWANATH78

Hi, I have downloaded the llama3 70B model. Can someone provide the steps to convert it into Hugging Face format and then run it in localGPT? I did the same for llama 70B and was able to run it, but I am not able to convert the full llama3 model files to .hf format, so proper steps for this would be appreciated. Thank you.
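For what it's worth, transformers ships a conversion script for Meta-format Llama checkpoints; a hedged sketch of its invocation (flag names, --llama_version in particular, depend on your installed transformers version, so verify with --help first):

# Sketch: convert Meta-format llama-3 weights into a Hugging Face style directory.
# Paths are placeholders; check the exact flags against your transformers version.
python src/transformers/models/llama/convert_llama_weights_to_hf.py \
    --input_dir /path/to/Meta-Llama-3-70B \
    --model_size 70B \
    --llama_version 3 \
    --output_dir /path/to/llama-3-70b-hf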

@carloposo


Hi @toomy0toons, I'm trying to do the same but am having some issues, as per #793.

@toomy0toons

@carloposo
@KerenK-EXRM

My understanding is that the instruct model (8b) has an extra set of tokens or a different prompt template.

Try the 7b models?

@carloposo


There are no 7B models for llama3 (https://adithyask.medium.com/from-7b-to-8b-parameters-understanding-weight-matrix-changes-in-llama-transformer-models-31ea7ed5fd88).

Do you mean none of the embedding models in constants.py are OK for running any of the llama-3 8B models?
