
EvalKit

The TypeScript LLM Evaluations Library


EvalKit is an open-source library designed for TypeScript developers to evaluate and improve the performance of large language models (LLMs) with confidence. Ensure your AI models are reliable, accurate, and trustworthy.

🚀 Features, Metrics and Docs

Click here to navigate to the Official EvalKit Documentation

The documentation explains how to use EvalKit and how it is architected, and includes tutorials and recipes for various use cases and LLM providers.

| Feature | Availability | Docs |
| --- | --- | --- |
| Bias Detection Metric | ✅ | 🔗 |
| Coherence Metric | ✅ | 🔗 |
| Dynamic Metric (G-Eval) | ✅ | 🔗 |
| Faithfulness Metric | ✅ | 🔗 |
| Hallucination Metric | ✅ | 🔗 |
| Intent Detection Metric | ✅ | 🔗 |
| Semantic Similarity Metric | ✅ | 🔗 |
| Reporting | 🚧 | 🚧 |

Looking for a metric/feature that's not listed here? Open an issue and let us know!

Getting Started - Quickstart

Prerequisites

  • Node.js 18+
  • OpenAI API Key

Installation

EvalKit is currently published as a single core package that bundles all evaluation-related functionality. Install it by running the following command:

npm install --save-dev @evalkit/core
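
Once the package is installed, an evaluation run looks roughly like the sketch below. The exported names used here (`evaluate`, `BiasDetectionMetric`) and the argument shapes are assumptions for illustration only; consult the official documentation for the exact API. The sketch also assumes your OpenAI API key is exposed via the `OPENAI_API_KEY` environment variable, following the OpenAI SDK convention.

```typescript
// Minimal quickstart sketch. The exports used here (`evaluate`,
// `BiasDetectionMetric`) are assumptions for illustration only;
// check the official EvalKit documentation for the actual API surface.
// Assumes OPENAI_API_KEY is set in the environment, since the metrics
// call an OpenAI model under the hood.
import { evaluate, BiasDetectionMetric } from "@evalkit/core";

async function main() {
  const result = await evaluate(
    {
      // The LLM output you want to evaluate.
      output: "Only graduates of elite universities are worth hiring.",
    },
    // The metrics to run against that output.
    [BiasDetectionMetric],
  );

  // Inspect per-metric scores and pass/fail results.
  console.log(result);
}

main().catch(console.error);
```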

Contributing

We welcome contributions from the community! Please feel free to submit pull requests or create issues for bugs or feature suggestions.

License

This repository's source code is available under the Apache 2.0 License.