---
language:
- en
license: llama2
datasets:
- ehartford/WizardLM_evol_instruct_V2_196k_unfiltered_merged_split
model-index:
- name: WizardLM-1.0-Uncensored-Llama2-13b
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 55.72
      name: normalized accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 80.34
      name: normalized accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 55.4
      name: accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 51.44
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 74.66
      name: accuracy
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 13.27
      name: accuracy
---

This is a retraining of https://huggingface.co/WizardLM/WizardLM-13B-V1.0 with a filtered dataset, intended to reduce refusals, avoidance, and bias.

Note that LLaMA itself has inherent ethical beliefs, so there's no such thing as a "truly uncensored" model. But this model will be more compliant than WizardLM/WizardLM-13B-V1.0.

Shout out to the open source AI/ML community, and everyone who helped me out.

Note: An uncensored model has no guardrails. You are responsible for anything you do with the model, just as you are responsible for anything you do with any dangerous object such as a knife, gun, lighter, or car. Publishing anything this model generates is the same as publishing it yourself. You are responsible for the content you publish, and you cannot blame the model any more than you can blame the knife, gun, lighter, or car for what you do with it.

Like WizardLM/WizardLM-13B-V1.0, this model is trained with Vicuna-1.1 style prompts.

```
You are a helpful AI assistant.

USER: <prompt>
ASSISTANT:
```
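As a minimal sketch, the template above can be filled in with a small helper. The function name `build_prompt` and its parameters are hypothetical, and the exact whitespace (one blank line after the system line, no trailing space after `ASSISTANT:`) is an assumption read off the template as shown. The card only shows a single turn, so multi-turn formatting is deliberately left out.

```python
# Hypothetical helper: builds a single-turn prompt in the format shown above.
DEFAULT_SYSTEM = "You are a helpful AI assistant."


def build_prompt(user_message: str, system: str = DEFAULT_SYSTEM) -> str:
    """Return the system line, a blank line, the USER turn, and an open ASSISTANT turn."""
    # Strip surrounding whitespace so the template's line structure stays intact.
    return f"{system}\n\nUSER: {user_message.strip()}\nASSISTANT:"


if __name__ == "__main__":
    print(build_prompt("Name three uses for a paperclip."))
```

The model's completion would then be whatever it generates after the final `ASSISTANT:` line.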

Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 49.31 |
| ARC (25-shot) | 55.72 |
| HellaSwag (10-shot) | 80.34 |
| MMLU (5-shot) | 55.4 |
| TruthfulQA (0-shot) | 51.44 |
| Winogrande (5-shot) | 74.66 |
| GSM8K (5-shot) | 13.27 |
| DROP (3-shot) | 14.35 |

Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 55.14 |
| AI2 Reasoning Challenge (25-Shot) | 55.72 |
| HellaSwag (10-Shot) | 80.34 |
| MMLU (5-Shot) | 55.40 |
| TruthfulQA (0-shot) | 51.44 |
| Winogrande (5-shot) | 74.66 |
| GSM8k (5-shot) | 13.27 |
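The two "Avg." rows differ only because the first table includes DROP and the second does not. The sketch below checks this, assuming each "Avg." is the plain arithmetic mean of the rows beneath it. That is inferred from the numbers and is not stated in the card. The dictionary and function names are hypothetical.

```python
from statistics import mean

# Scores copied from the two results tables above.
with_drop = {
    "ARC (25-shot)": 55.72,
    "HellaSwag (10-shot)": 80.34,
    "MMLU (5-shot)": 55.4,
    "TruthfulQA (0-shot)": 51.44,
    "Winogrande (5-shot)": 74.66,
    "GSM8K (5-shot)": 13.27,
    "DROP (3-shot)": 14.35,
}

# The second table reports the same benchmarks without DROP.
without_drop = {k: v for k, v in with_drop.items() if not k.startswith("DROP")}


def leaderboard_average(scores: dict[str, float]) -> float:
    """Unweighted mean of the benchmark scores, rounded to two decimals."""
    return round(mean(scores.values()), 2)


if __name__ == "__main__":
    print(leaderboard_average(with_drop))     # seven benchmarks
    print(leaderboard_average(without_drop))  # six benchmarks
```

Under that assumption the seven-benchmark mean matches the 49.31 reported in the first table and the six-benchmark mean matches the 55.14 in the second, so the per-benchmark scores are identical and only the benchmark set changed.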