---
language:
- en
license: apache-2.0
library_name: transformers
datasets:
- monology/VMware-open-instruct-higgsfield
pipeline_tag: text-generation
model-index:
- name: openinstruct-mistral-7b
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 59.73
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 82.77
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 60.55
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 48.76
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 79.56
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 50.49
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard
      name: Open LLM Leaderboard
---
**1st among commercially-usable 7B models on the Open LLM Leaderboard!**\*

This is mistralai/Mistral-7B-v0.1 finetuned on VMware/open-instruct.

Quantized to FP16 and released under the Apache-2.0 license by me.

Compute generously provided by Higgsfield AI.
Prompt template:

```
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
[your instruction goes here]

### Response:
```
Recommended inference parameters:
- temperature: 0.2
- top_k: 50
- top_p: 0.95
- repetition_penalty: 1.1
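The template and parameters above can be wired together in code. Below is a minimal sketch: the `build_prompt` helper and `GENERATION_KWARGS` dict are illustrative names of my own, not part of this card, and `do_sample=True` is an assumption (temperature/top-k/top-p only take effect when sampling is enabled).

```python
# Illustrative helper (not from the card) that wraps an instruction
# in the Alpaca-style template this model was finetuned on.
def build_prompt(instruction: str) -> str:
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )

# Recommended sampling settings from this card; do_sample is assumed,
# since the temperature/top_k/top_p knobs only apply when sampling.
GENERATION_KWARGS = dict(
    do_sample=True,
    temperature=0.2,
    top_k=50,
    top_p=0.95,
    repetition_penalty=1.1,
)

print(build_prompt("Name three primary colors."))
```

With `transformers`, these would then be passed along the lines of `pipeline("text-generation", model="monology/openinstruct-mistral-7b")(build_prompt(...), max_new_tokens=256, **GENERATION_KWARGS)`.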
\*As of 21 Nov 2023. "Commercially-usable" here means both an open-source base model and a non-synthetic open-source finetuning dataset. Updated leaderboard results are available here.
Detailed results can be found here.

| Metric                            | Value |
|-----------------------------------|------:|
| Avg.                              | 63.64 |
| AI2 Reasoning Challenge (25-Shot) | 59.73 |
| HellaSwag (10-Shot)               | 82.77 |
| MMLU (5-Shot)                     | 60.55 |
| TruthfulQA (0-shot)               | 48.76 |
| Winogrande (5-shot)               | 79.56 |
| GSM8k (5-shot)                    | 50.49 |