Merge pull request #157 from eltociear/patch-1
Update README_en.md
kaisugi committed Jan 25, 2024
2 parents 64941f0 + 866e277 commit 33f70e7
Showing 1 changed file with 2 additions and 2 deletions.
README_en.md: 2 additions & 2 deletions
@@ -293,7 +293,7 @@ Ranking based on model answers to [40 open-ended questions](https://huggingface.

#### [ELYZA-tasks-100](https://huggingface.co/datasets/elyza/ELYZA-tasks-100) (ELYZA)

-Ranking based on model responses to [100 complex and diverse tasks](https://huggingface.co/datasets/elyza/ELYZA-tasks-100), including tasks testing summarization, correction, abstraction, induction, and other skills. Uses humans to score the model responses and then ranks models based on their mean scores. Evaluation results can be found [here](https://docs.google.com/spreadsheets/d/1mtoy4QAqDPk2f_B0vDogFoOrbA5G42DBEEHdqM4VmDI/edit#gid=1023787356) and [here](https://zenn.dev/elyza/articles/5e7d9373c32a98). For a evaluation containing newer models, see [here](https://note.com/elyza/n/n5d42686b60b7).
+Ranking based on model responses to [100 complex and diverse tasks](https://huggingface.co/datasets/elyza/ELYZA-tasks-100), including tasks testing summarization, correction, abstraction, induction, and other skills. Uses humans to score the model responses and then ranks models based on their mean scores. Evaluation results can be found [here](https://docs.google.com/spreadsheets/d/1mtoy4QAqDPk2f_B0vDogFoOrbA5G42DBEEHdqM4VmDI/edit#gid=1023787356) and [here](https://zenn.dev/elyza/articles/5e7d9373c32a98). For an evaluation containing newer models, see [here](https://note.com/elyza/n/n5d42686b60b7).

#### [Japanese Vicuna QA Benchmark](https://github.com/ku-nlp/ja-vicuna-qa-benchmark) (Kyoto University Language Media Processing Lab)

@@ -401,4 +401,4 @@ If you find this resource useful, please consider citing it:

[^11]: See "[Japanese MiniGPT-4: rinna 3.6bとBLIP-2を組み合わせてマルチモーダルチャットのモデルを作る](https://zenn.dev/rinna/articles/5fad41e3f2a401)" for further details. Note the article discusses using rinna/japanese-gpt-neox-3.6b as the LLM component rather than the rinna/bilingual-gpt-neox-4b model that MiniGPT-4 actually uses.

-[^12]: In Instruction Tuning, because it uses data generated by OpenAI's models, such as GPT-3.5 and GPT-4, for training, there is a possibility that it may violate OpenAI's terms.
+[^12]: In Instruction Tuning, because it uses data generated by OpenAI's models, such as GPT-3.5 and GPT-4, for training, there is a possibility that it may violate OpenAI's terms.
