Thank you for releasing Llama Guard 2, it looks like a very promising model!
I was wondering if it would be feasible to release precision/recall curves, or per-category numbers, from your internal benchmark evaluation? Or is there any hope of publicly releasing a small labeled test set so the community can evaluate for ourselves?
From Table 2 in the model card, it looks like a classification threshold of 0.5 results in rather high FNRs for some categories. I'd like to use a classification threshold with more balanced errors, but I'm not sure how to go about tuning it myself, because the new MLCommons harm taxonomy doesn't map 1:1 onto public content classification datasets like OpenAI's moderation dataset.
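For context, here's roughly what I'd want to do if per-category labeled data were available. This is just a sketch with made-up scores and labels (the function name and toy data are mine, not from the model card): sweep candidate thresholds over the model's "unsafe" probabilities and pick the one that minimizes the gap between FNR and FPR.

```python
import numpy as np

def balanced_threshold(probs, labels, steps=101):
    """Return the threshold that minimizes |FNR - FPR|.

    probs:  per-example probability of the "unsafe" class
    labels: binary ground truth (1 = unsafe)
    """
    labels = np.asarray(labels, dtype=bool)
    probs = np.asarray(probs, dtype=float)
    best_t, best_gap = 0.5, float("inf")
    for t in np.linspace(0.0, 1.0, steps):
        preds = probs >= t
        # False negatives: truly unsafe but predicted safe.
        fnr = np.sum(labels & ~preds) / max(labels.sum(), 1)
        # False positives: truly safe but predicted unsafe.
        fpr = np.sum(~labels & preds) / max((~labels).sum(), 1)
        gap = abs(fnr - fpr)
        if gap < best_gap:
            best_t, best_gap = t, gap
    return best_t

# Toy example (hypothetical scores, not real Llama Guard 2 outputs):
probs = [0.9, 0.8, 0.4, 0.35, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
t = balanced_threshold(probs, labels)
```

The blocker is that without a labeled test set in the MLCommons taxonomy, there's nothing principled to pass in as `labels` for each harm category.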