# lm-evaluation

Here are 2 public repositories matching this topic...

IAAR-Shanghai / xFinder

xFinder: Robust and Pinpoint Answer Extraction for Large Language Models

benchmark regex reliability evaluation dataset gpt large-language-models llm open-compass lm-evaluation xfinder reliable-evaluation key-answer-extraction

Updated May 31, 2024
Python

hitz-zentroa / latxa

Latxa: An Open Language Model and Evaluation Suite for Basque

evaluation language-model basque huggingface gpt-neox llm lm-evaluation latxa

Updated May 15, 2024
Shell
