
add gpt2 based llama2 #529




loxs123 commented Jan 22, 2024

LLaMA 2 implemented on top of the LiBai GPT-2 model.


loxs123 commented Jan 22, 2024

Preliminary tests

In inference, the GPT-2-based LLaMA 2 implementation produces the same output as the original LLaMA implementation. Both return:

[{'generated_text': 'Give three tips for staying healthy.\nWhat is the best way to stay healthy?\nWhat are the 5 ways to stay healthy?\nWhat are the 5 ways to stay healthy?\nWhat are the '}]

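Preliminary training-time comparison against the original LiBai LLaMA implementation

The GPT-2-based LLaMA implementation has an estimated total training time of about 11 hours [4 GPUs]; all other parameters follow the original repository's configuration.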

[01/22 13:07:19 lb.utils.events]:  eta: 11:00:05  iteration: 9/37320  consumed_samples: 320  total_loss: 1.49  time: 1.0820 s/iter  data_time: 0.0027 s/iter total_throughput: 3.70 samples/s lr: 3.62e-08  
[01/22 13:07:30 lb.utils.events]:  eta: 11:06:09  iteration: 19/37320  consumed_samples: 640  total_loss: 1.415  time: 1.0827 s/iter  data_time: 0.0026 s/iter total_throughput: 3.69 samples/s lr: 7.64e-08

The original LLaMA implementation also has an estimated total training time of about 11 hours [4 GPUs].

[01/22 13:03:18 lb.utils.events]:  eta: 10:55:21  iteration: 9/37320  consumed_samples: 320  total_loss: 1.49  time: 1.0764 s/iter  data_time: 0.0026 s/iter total_throughput: 3.72 samples/s lr: 3.62e-08  
[01/22 13:03:29 lb.utils.events]:  eta: 11:03:12  iteration: 19/37320  consumed_samples: 640  total_loss: 1.415  time: 1.0771 s/iter  data_time: 0.0026 s/iter total_throughput: 3.71 samples/s lr: 7.64e-08
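
To put a number on the difference between the two runs, the log lines above can be parsed and compared. This is a minimal sketch, not part of the PR: the regular expression, the helper names `parse_line` and `summarize`, and the projection (total iterations times the latest reported `s/iter`) are assumptions made for illustration, and the projection is not necessarily how LiBai computes its `eta` field.

```python
import re

# Log lines copied verbatim from the two runs above.
GPT2_BASED = [
    "[01/22 13:07:19 lb.utils.events]:  eta: 11:00:05  iteration: 9/37320  consumed_samples: 320  total_loss: 1.49  time: 1.0820 s/iter  data_time: 0.0027 s/iter total_throughput: 3.70 samples/s lr: 3.62e-08",
    "[01/22 13:07:30 lb.utils.events]:  eta: 11:06:09  iteration: 19/37320  consumed_samples: 640  total_loss: 1.415  time: 1.0827 s/iter  data_time: 0.0026 s/iter total_throughput: 3.69 samples/s lr: 7.64e-08",
]
ORIGINAL = [
    "[01/22 13:03:18 lb.utils.events]:  eta: 10:55:21  iteration: 9/37320  consumed_samples: 320  total_loss: 1.49  time: 1.0764 s/iter  data_time: 0.0026 s/iter total_throughput: 3.72 samples/s lr: 3.62e-08",
    "[01/22 13:03:29 lb.utils.events]:  eta: 11:03:12  iteration: 19/37320  consumed_samples: 640  total_loss: 1.415  time: 1.0771 s/iter  data_time: 0.0026 s/iter total_throughput: 3.71 samples/s lr: 7.64e-08",
]

# \btime: matches the "time:" field but not "data_time:", because "_" is a
# word character and so there is no word boundary inside "data_time".
LINE_RE = re.compile(r"iteration:\s*(\d+)/(\d+).*?\btime:\s*([\d.]+)\s*s/iter")


def parse_line(line):
    """Return (iteration, total_iterations, seconds_per_iteration) for one log line."""
    match = LINE_RE.search(line)
    if match is None:
        raise ValueError(f"unrecognized log line: {line!r}")
    return int(match.group(1)), int(match.group(2)), float(match.group(3))


def summarize(lines):
    """Use the latest log line to project the total training time in hours."""
    _, total_iters, sec_per_iter = parse_line(lines[-1])
    return {
        "sec_per_iter": sec_per_iter,
        "projected_hours": total_iters * sec_per_iter / 3600,
    }


gpt2 = summarize(GPT2_BASED)
orig = summarize(ORIGINAL)
# Relative slowdown of the GPT-2-based run versus the original implementation.
slowdown = gpt2["sec_per_iter"] / orig["sec_per_iter"] - 1

print(f"gpt2-based: {gpt2['sec_per_iter']:.4f} s/iter, ~{gpt2['projected_hours']:.2f} h")
print(f"original:   {orig['sec_per_iter']:.4f} s/iter, ~{orig['projected_hours']:.2f} h")
print(f"relative slowdown: {slowdown:.2%}")
```

On these four lines the GPT-2-based run comes out roughly 0.5% slower per iteration, and both projections land near 11.2 hours. With only the first 19 iterations logged, a gap this small may well be within run-to-run noise.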

It seems the GPT-based model implementation still needs some further adjustment.
