---
language:
- en
license: llama2
datasets:
- ehartford/WizardLM_evol_instruct_V2_196k_unfiltered_merged_split
model-index:
- name: WizardLM-1.0-Uncensored-Llama2-13b
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 55.72
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ehartford/WizardLM-1.0-Uncensored-Llama2-13b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 80.34
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ehartford/WizardLM-1.0-Uncensored-Llama2-13b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 55.4
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ehartford/WizardLM-1.0-Uncensored-Llama2-13b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 51.44
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ehartford/WizardLM-1.0-Uncensored-Llama2-13b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 74.66
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ehartford/WizardLM-1.0-Uncensored-Llama2-13b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 13.27
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=ehartford/WizardLM-1.0-Uncensored-Llama2-13b
      name: Open LLM Leaderboard
---

This is a retraining of https://huggingface.co/WizardLM/WizardLM-13B-V1.0 with a filtered dataset, intended to reduce refusals, avoidance, and bias.

Note that LLaMA itself has inherent ethical beliefs, so there's no such thing as a "truly uncensored" model. But this model will be more compliant than WizardLM/WizardLM-13B-V1.0.

Shout out to the open source AI/ML community, and everyone who helped me out.

Note: An uncensored model has no guardrails. You are responsible for anything you do with the model, just as you are responsible for anything you do with any dangerous object such as a knife, gun, lighter, or car. Publishing anything this model generates is the same as publishing it yourself. You are responsible for the content you publish, and you cannot blame the model any more than you can blame the knife, gun, lighter, or car for what you do with it.

Like WizardLM/WizardLM-13B-V1.0, this model is trained with Vicuna-1.1 style prompts.

You are a helpful AI assistant.

USER: <prompt>
ASSISTANT:
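
As a minimal sketch of the prompt layout above, the helper below assembles a single-turn prompt string. The function name `build_prompt` and the constant `DEFAULT_SYSTEM_PROMPT` are hypothetical names introduced for illustration, and the exact whitespace (one blank line after the system line, nothing after `ASSISTANT:`) is an assumption read off the template as shown here, not a confirmed detail of the training format.

```python
# Hypothetical helper, not part of the model's own tooling.
DEFAULT_SYSTEM_PROMPT = "You are a helpful AI assistant."


def build_prompt(user_message: str, system_prompt: str = DEFAULT_SYSTEM_PROMPT) -> str:
    """Format a single-turn prompt in the layout shown in this README."""
    # System line, a blank line, the USER turn, then an open ASSISTANT turn
    # that the model is expected to complete. Whitespace is an assumption.
    return f"{system_prompt}\n\nUSER: {user_message.strip()}\nASSISTANT:"


if __name__ == "__main__":
    print(build_prompt("What is the capital of France?"))
```

The resulting string can be passed to whichever inference stack you use; generation should continue from the text after `ASSISTANT:`.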

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric	Value
Avg.	49.31
ARC (25-shot)	55.72
HellaSwag (10-shot)	80.34
MMLU (5-shot)	55.4
TruthfulQA (0-shot)	51.44
Winogrande (5-shot)	74.66
GSM8K (5-shot)	13.27
DROP (3-shot)	14.35

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric	Value
Avg.	55.14
AI2 Reasoning Challenge (25-Shot)	55.72
HellaSwag (10-Shot)	80.34
MMLU (5-Shot)	55.40
TruthfulQA (0-shot)	51.44
Winogrande (5-shot)	74.66
GSM8k (5-shot)	13.27
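
The two averages reported above differ only because the earlier table includes DROP (3-shot) and the later one does not. The short check below recomputes both from the per-benchmark scores listed in the tables; it assumes each average is a plain unweighted mean, which matches the reported figures, and the variable names are illustrative.

```python
from statistics import mean

# Per-benchmark scores copied from the two tables in this README.
scores = {
    "ARC (25-shot)": 55.72,
    "HellaSwag (10-shot)": 80.34,
    "MMLU (5-shot)": 55.40,
    "TruthfulQA (0-shot)": 51.44,
    "Winogrande (5-shot)": 74.66,
    "GSM8K (5-shot)": 13.27,
    "DROP (3-shot)": 14.35,
}

# Unweighted mean over all seven benchmarks (earlier table, with DROP).
avg_with_drop = round(mean(scores.values()), 2)

# Unweighted mean over the six benchmarks in the later table (no DROP).
avg_without_drop = round(
    mean(value for name, value in scores.items() if not name.startswith("DROP")), 2
)

print(avg_with_drop)     # 49.31
print(avg_without_drop)  # 55.14
```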
