A framework for few-shot evaluation of language models.
This is the repository of our article published in RecSys 2019 "Are We Really Making Much Progress? A Worrying Analysis of Recent Neural Recommendation Approaches" and of several follow-up studies.
Test your prompts, agents, and RAG pipelines. Use LLM evals to improve your app's quality and catch problems. Compare the performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command-line and CI/CD integration.
The LLM Evaluation Framework
LightEval is a lightweight LLM evaluation suite that Hugging Face has been using internally, alongside its recently released LLM data processing library datatrove and LLM training library nanotron.
Evaluation Framework for Dependency Analysis (EFDA)
Python-based tools for pre- and post-processing, validating, and curating spike-sorting datasets.
BIRL: Benchmark on Image Registration methods with Landmark validations
Metrics to evaluate the quality of responses of your Retrieval Augmented Generation (RAG) applications.
Expressive is a cross-platform expression parsing and evaluation framework. Its cross-platform nature comes from compiling for .NET Standard, so it runs on practically any platform.
Optical Flow Dataset and Benchmark for Visual Crowd Analysis
Evaluate your biometric verification models literally in seconds.
PySODEvalToolkit: A Python-based Evaluation Toolbox for Salient Object Detection and Camouflaged Object Detection
Open-Source Evaluation for GenAI Application Pipelines
LiDAR SLAM comparison and evaluation framework
Moonshot - A simple and modular tool to evaluate and red-team any LLM application.
Multilingual Large Language Models Evaluation Benchmark
Evaluation suite for large-scale language models.
A research library for automating experiments on Deep Graph Networks
OD-test: A Less Biased Evaluation of Out-of-Distribution (Outlier) Detectors (PyTorch)