Thank you for releasing Llama Guard 2, it looks like a very promising model!
I was wondering if it would be feasible to release precision/recall curves, or per-category numbers, from your internal benchmark evaluation? Or is there any hope of publicly releasing a small labeled test set so the community can evaluate for ourselves?
From Table 2 in the model card, it looks like a classification threshold of 0.5 results in rather high FNRs for some categories. I'd like to use a classification threshold with more balanced errors, but I'm not sure how to go about tuning it myself, because the new MLCommons harm taxonomy doesn't map 1:1 onto public content classification datasets like OpenAI's moderation dataset.
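For context, here's roughly what I'd want to do if per-category labeled data were available. This is just a sketch with made-up scores and labels (the function name and toy data are mine, not from the model card): sweep candidate thresholds over the model's "unsafe" probabilities and pick the one that minimizes the gap between FNR and FPR.

```python
import numpy as np

def balanced_threshold(probs, labels, steps=101):
    """Return the threshold that minimizes |FNR - FPR|.

    probs:  per-example probability of the "unsafe" class
    labels: binary ground truth (1 = unsafe)
    """
    labels = np.asarray(labels, dtype=bool)
    probs = np.asarray(probs, dtype=float)
    best_t, best_gap = 0.5, float("inf")
    for t in np.linspace(0.0, 1.0, steps):
        preds = probs >= t
        # False negatives: truly unsafe but predicted safe.
        fnr = np.sum(labels & ~preds) / max(labels.sum(), 1)
        # False positives: truly safe but predicted unsafe.
        fpr = np.sum(~labels & preds) / max((~labels).sum(), 1)
        gap = abs(fnr - fpr)
        if gap < best_gap:
            best_t, best_gap = t, gap
    return best_t

# Toy example (hypothetical scores, not real Llama Guard 2 outputs):
probs = [0.9, 0.8, 0.4, 0.35, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
t = balanced_threshold(probs, labels)
```

The blocker is that without a labeled test set in the MLCommons taxonomy, there's nothing principled to pass in as `labels` for each harm category.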