
Do the published training weights "7b_tiva_v0" include all three stages of training results simultaneously? #62

Open
pengxuan001 opened this issue Nov 1, 2023 · 4 comments


@pengxuan001 commented Nov 1, 2023

From the training code, it looks like the weights of each training stage are saved to a separate file. Do the published training weights "7b_tiva_v0" include the results of all three training stages at once? In addition, in the inference code, the input projection layer, output projection layer, and LoRA weights of the LLM appear to be freshly initialized rather than loaded from the released model "7b_tiva_v0".

@ChocoWu (Collaborator) commented Nov 2, 2023

Hi, the released checkpoint includes all training parameters across all three stages.
During the inference stage, we indeed load the pre-trained parameters from "7b_tiva_v0" after model initialization.
Please refer to the code snippet below:

NExT-GPT/code/inference.py, lines 110 to 111 in e2e2f94:

model = NextGPTModel(**args)
delta_ckpt = torch.load(os.path.join(args['nextgpt_ckpt_path'], 'pytorch_model.pt'), map_location=torch.device('cuda'))

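(Note: the snippet above only shows the checkpoint being read from disk; applying it to the model is presumably the very next step, along the lines of the single line below. The strict=False flag is an assumption, made because the delta checkpoint contains only the trained parameters rather than the full model state.)

model.load_state_dict(delta_ckpt, strict=False)  # assumed follow-up: apply the loaded delta checkpoint to the initialized model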
@pengxuan001 (Author) commented Nov 2, 2023


@ChocoWu Thank you for your reply. I have another question: when I run the training myself, how do I save the results of all three stages into a single weight file? Can I simply point every stage at the same path, such as "7b_tiva_v0"? Will the results of each stage be merged, or will later stages overwrite earlier ones?

It also seems that the results of the first-stage training are not used during the second stage, and the results of the first and second stages are not used during the third stage.

@ChocoWu (Collaborator) commented Nov 2, 2023

@pengxuan001, actually, the results of the previous training stage are loaded and used during the next stage of training:

self.load_parameters(self.args['save_path'], self.args['stage'])

If you want to keep the weights trained in the different stages separately, you need to specify a different save path via --save_path for each stage. Otherwise, the results will be overwritten.
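(To make the save/overwrite behaviour concrete, here is a minimal sketch. The function bodies, the pytorch_model.pt file name, and the strict=False loading are assumptions modelled on the inference snippet above, not the actual NExT-GPT training code.)

# Minimal sketch (assumptions, not the actual training code): each stage picks up
# whatever the previous stage left in save_path, then writes to the same file name,
# so a shared save_path is overwritten stage after stage.
import os
import torch

def load_parameters(model, save_path, stage):
    # Load the checkpoint written by the previous stage, if one exists.
    ckpt_file = os.path.join(save_path, 'pytorch_model.pt')  # assumed file name
    if stage > 1 and os.path.exists(ckpt_file):
        delta_ckpt = torch.load(ckpt_file, map_location='cpu')
        model.load_state_dict(delta_ckpt, strict=False)  # delta checkpoint holds only the trainable parameters

def save_parameters(model, save_path):
    # Every stage writes to the same file name, hence the overwriting when save_path is shared.
    os.makedirs(save_path, exist_ok=True)
    torch.save(model.state_dict(), os.path.join(save_path, 'pytorch_model.pt'))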

@jwzhi commented Feb 6, 2024

Are there any suggestions on how to load 7b_tiva_v0 during the training stage? I tried to continue instruction tuning on my own data starting from the provided 7b_tiva_v0 checkpoint. However, simply setting --save_path to 7b_tiva_v0 does not work: the checkpoint-loading function at training time seems to always load the Vicuna weights instead of the NExT-GPT weights. Thank you for your help.
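(Not an official answer, just an illustration of one possible workaround based on the inference snippet earlier in this thread: load the released delta checkpoint into the initialized model yourself before the training loop resumes. The checkpoint path, the cpu map_location, and the strict=False flag below are assumptions.)

# Possible workaround (untested sketch): mirror what inference.py does and load the
# released 7b_tiva_v0 delta checkpoint right after model initialization, before training.
import os
import torch

model = NextGPTModel(**args)  # same initialization as in the inference snippet above
delta_ckpt = torch.load(
    os.path.join(args['nextgpt_ckpt_path'], 'pytorch_model.pt'),  # point this at the 7b_tiva_v0 directory
    map_location=torch.device('cpu'),
)
model.load_state_dict(delta_ckpt, strict=False)  # only the trained (delta) parameters are in the checkpoint
# ...then continue with the normal instruction-tuning loop on your own data.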
