[QA] Why is InternLM2-Chat-SFT based on InternLM2-Base instead of InternLM2? #606
Answered by ZwwWayne
underspirit asked this question in Q&A
-
Describe the question. InternLM2 is an enhanced model based on InternLM2-Base, and its capabilities should be better in many domains. Why isn't the subsequent SFT model based on it?
Answered by ZwwWayne on Jan 17, 2024
Replies: 1 comment
-
This is simply due to time constraints. InternLM2 and InternLM2-Chat were trained in parallel because of the limited time before release. Furthermore, InternLM2 and InternLM2-Chat are optimized for different capability dimensions, as you can see from the evaluation results.
Answer selected by ZwwWayne