evals

Here are 6 public repositories matching this topic...

langfuse / langfuse

🪢 Open source LLM engineering platform: Observability, metrics, evals, prompt management, playground, datasets. Integrates with LlamaIndex, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23

open-source playground monitoring analytics evaluation self-hosted ycombinator openai gpt observability large-language-models llm prompt-engineering langchain llmops llama-index prompt-management evals llm-evaluation

Updated May 16, 2024
TypeScript

AgentOps-AI / agentops

Star

Open source Python SDK for agent monitoring, LLM cost tracking, benchmarking, and more. Integrates with most LLMs and agent frameworks like CrewAI, Langchain, and Autogen

agent ai openai evaluation-metrics mistral cost-estimation autogen groq agentops llm langchain anthropic evals ollama crewai

Updated May 16, 2024
Python

nstankov-bg / oaievals-collector

Star

The OAIEvals Collector: A robust, Go-based metric collector for EVALS data. Supports Kafka, Elastic, Loki, InfluxDB, TimescaleDB integrations, and containerized deployment with Docker. Streamlines OAI-Evals data management efficiently with a low barrier of entry!

go docker devops openai chatgpt evals

Updated Oct 26, 2023
Go

zeus-fyi / mockingbird

Star

Mockingbird Front End Code | Zeus + SciFi = Power of the gods (cloud + ai | Zeus) Meets the power of SciFi (human ingenuity | SfYi) At the intersection of intelligent design (systems engineering excellence) For your intelligence —ZeusFYI.

react redux control ai evals