Add Unify Support #12921

Merged · 28 commits · May 21, 2024
Changes from 24 commits

Commits (28)
1fd1dfc
unify integration with llama index
aparajith21 Mar 13, 2024
3ef1c51
Merge branch 'run-llama:main' into main
aparajith21 Mar 13, 2024
d8419c8
adding docs
aparajith21 Mar 14, 2024
1ac3fb7
updating notebook
aparajith21 Mar 14, 2024
9656f8a
formatting fixes in notebook
aparajith21 Mar 14, 2024
e2a7ed8
improve examples
aparajith21 Mar 14, 2024
9ef4570
formatting fix
aparajith21 Mar 14, 2024
3fb9f5d
reducing verbosity
aparajith21 Mar 14, 2024
a403cbe
adding live benchmark link
aparajith21 Mar 14, 2024
1290a71
notebook edits
aparajith21 Mar 15, 2024
35a0428
formatting fix
aparajith21 Mar 15, 2024
e4d5501
Update unify.ipynb
djl11 Mar 15, 2024
b433a06
bigger fonts
aparajith21 Mar 15, 2024
ee8706a
tweaking font size in unify notebook
aparajith21 Mar 15, 2024
add938c
minor unify nb edit
aparajith21 Mar 15, 2024
035c1a9
add dependencies
guillesanbri Mar 22, 2024
52c862f
add notebook to index
guillesanbri Mar 22, 2024
3958a55
changes to notebook
guillesanbri Mar 22, 2024
215924b
Merge remote-tracking branch 'upstream/main' into main
guillesanbri Apr 11, 2024
a8e527f
change docs after refactor
guillesanbri Apr 11, 2024
fdadae8
add test
guillesanbri Apr 11, 2024
64aa94f
format + lint
guillesanbri Apr 11, 2024
0f08f2c
add poetry lock
guillesanbri Apr 11, 2024
1877747
improve readme
guillesanbri Apr 11, 2024
2399abf
Fix linting issues
hello-fri-end May 20, 2024
5365790
Fix CI errors
hello-fri-end May 20, 2024
8f7a73a
merge with main
hello-fri-end May 21, 2024
337d18d
fix linting errors
hello-fri-end May 21, 2024
240 changes: 240 additions & 0 deletions docs/docs/examples/llm/unify.ipynb
@@ -0,0 +1,240 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Unify\n",
"\n",
"[Unify](https://unify.ai/hub) dynamically routes each query to the best LLM, with support for providers such as OpenAI, MistralAI, Perplexity AI, and Together AI. You can also access all providers individually using a single API key.\n",
"\n",
"You can check out our [live benchmarks](https://unify.ai/hub/mixtral-8x7b-instruct-v0.1) to see where the data is coming from!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Installation"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First, let's install LlamaIndex 🦙 and the Unify integration."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%pip install llama-index-llms-unify llama-index"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Environment Setup\n",
"\n",
"Make sure to set the `UNIFY_API_KEY` environment variable. You can get a key in the [Unify Console](https://console.unify.ai/login)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"UNIFY_API_KEY\"] = \"<YOUR API KEY>\""
]
},
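{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you prefer not to hardcode the key in the notebook, here is a minimal sketch using Python's standard `getpass` module (assuming an interactive session) that prompts for it instead:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"from getpass import getpass\n",
"\n",
"# Prompt for the key only if it isn't already set in the environment\n",
"if \"UNIFY_API_KEY\" not in os.environ:\n",
"    os.environ[\"UNIFY_API_KEY\"] = getpass(\"Unify API key: \")"
]
},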
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Using LlamaIndex with Unify"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Routing a request\n",
"\n",
"The first thing we can do is initialize and query a chat model. To configure Unify's router, pass an endpoint string to `Unify`. You can read more about this in [Unify's docs](https://unify.ai/docs/hub/concepts/runtime_routing.html).\n",
"\n",
"In this case, we will use the cheapest endpoint for `llama2-70b` in terms of input cost and then use `complete`."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"CompletionResponse(text=\" I'm doing well, thanks for asking! It's always a pleasure to chat with you. I hope you're having a great day too! Is there anything specific you'd like to talk about or ask me? I'm here to help with any questions you might have.\", additional_kwargs={}, raw={'id': 'meta-llama/Llama-2-70b-chat-hf-b90de288-1927-4f32-9ecb-368983c45321', 'choices': [Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content=\" I'm doing well, thanks for asking! It's always a pleasure to chat with you. I hope you're having a great day too! Is there anything specific you'd like to talk about or ask me? I'm here to help with any questions you might have.\", role='assistant', function_call=None, tool_calls=None, tool_call_id=None))], 'created': 1711047739, 'model': 'llama-2-70b-chat@anyscale', 'object': 'chat.completion', 'system_fingerprint': None, 'usage': CompletionUsage(completion_tokens=62, prompt_tokens=16, total_tokens=78, cost=7.8e-05)}, logprobs=None, delta=None)"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from llama_index.llms.unify import Unify\n",
"\n",
"llm = Unify(model=\"llama-2-70b-chat@dinput-cost\")\n",
"llm.complete(\"How are you today, llama?\")"
]
},
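{
"cell_type": "markdown",
"metadata": {},
"source": [
"Since `Unify` implements the standard LlamaIndex LLM interface, the chat API works too. Here is a minimal sketch, assuming the usual `ChatMessage` class from `llama_index.core.llms`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from llama_index.core.llms import ChatMessage\n",
"\n",
"# Reuse the router-configured llm from above with the chat interface\n",
"messages = [\n",
"    ChatMessage(role=\"system\", content=\"You are a concise assistant.\"),\n",
"    ChatMessage(role=\"user\", content=\"How are you today, llama?\"),\n",
"]\n",
"print(llm.chat(messages))"
]
},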
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Single Sign-On\n",
"\n",
"If you don't want the router to select the provider, you can also use our SSO to query endpoints in different providers without making accounts with all of them. For example, all of these are valid endpoints:"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"llm = Unify(model=\"llama-2-70b-chat@together-ai\")\n",
"llm = Unify(model=\"gpt-3.5-turbo@openai\")\n",
"llm = Unify(model=\"mixtral-8x7b-instruct-v0.1@mistral-ai\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This allows you to quickly switch and test different models and providers. For example, if you are working on an application that uses gpt-4 under the hood, you can use this to query a much cheaper LLM during development and/or testing to reduce costs.\n",
"\n",
"Take a look at the available ones [here](https://unify.ai/hub)!"
]
},
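{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of what that switch could look like, driven by an environment variable (the `APP_ENV` name and the endpoint choices here are just illustrative assumptions):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"# Hypothetical toggle: use a cheaper endpoint everywhere except production\n",
"endpoint = (\n",
"    \"gpt-4@openai\"\n",
"    if os.environ.get(\"APP_ENV\") == \"production\"\n",
"    else \"gpt-3.5-turbo@openai\"\n",
")\n",
"llm = Unify(model=endpoint)"
]
},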
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Streaming and optimizing for latency\n",
"\n",
"If you are building an application where responsiveness is key, you most likely want to get a streaming response. On top of that, ideally you would use the provider with the lowest Time to First Token, to reduce the time your users are waiting for a response. Using Unify this would look something like:"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
"llm = Unify(model=\"mixtral-8x7b-instruct-v0.1@ttft\")\n",
"\n",
"response = llm.stream_complete(\n",
" \"Translate the following to German: \"\n",
" \"Hey, there's an emergency in translation street, \"\n",
" \"please send help asap!\"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Model and provider are : mixtral-8x7b-instruct-v0.1@mistral-ai\n",
"\n",
"Hallo, es gibt einen Notfall in der Übersetzungsstraße, bitte senden Sie Hilfe so schnell wie möglich!\n",
"\n",
"(Note: This is a literal translation and the term \"Übersetzungsstraße\" is not a standard or commonly used term in German. A more natural way to express the idea of a \"emergency in translation\" could be \"Notfall bei Übersetzungen\" or \"akute Übersetzungsnotwendigkeit\".)"
]
}
],
"source": [
"show_provider = True\n",
"for r in response:\n",
" if show_provider:\n",
" print(f\"Model and provider are : {r.raw['model']}\\n\")\n",
" show_provider = False\n",
" print(r.delta, end=\"\", flush=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Async calls and Lowest Input Cost\n",
"\n",
"Last but not least, you can also run request asynchronously. For tasks like long document summarization, optimizing for input costs is crucial. Unify's dynamic router can do this too!"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Model and provider are : mixtral-8x7b-instruct-v0.1@deepinfra\n",
"\n",
" OpenAI: Pioneering 'safe' artificial general intelligence.\n"
]
}
],
"source": [
"llm = Unify(model=\"mixtral-8x7b-instruct-v0.1@input-cost\")\n",
"\n",
"response = await llm.acomplete(\n",
" \"Summarize this in 10 words or less. OpenAI is a U.S. based artificial intelligence \"\n",
" \"(AI) research organization founded in December 2015, researching artificial intelligence \"\n",
" \"with the goal of developing 'safe and beneficial' artificial general intelligence, \"\n",
" \"which it defines as 'highly autonomous systems that outperform humans at most economically \"\n",
" \"valuable work'. As one of the leading organizations of the AI spring, it has developed \"\n",
" \"several large language models, advanced image generation models, and previously, released \"\n",
" \"open-source models. Its release of ChatGPT has been credited with starting the AI spring\"\n",
")\n",
"\n",
"print(f\"Model and provider are : {response.raw['model']}\\n\")\n",
"print(response)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "base",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.5"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
1 change: 1 addition & 0 deletions docs/docs/module_guides/models/llms/modules.md
@@ -50,6 +50,7 @@ We support integrations with OpenAI, Anthropic, Hugging Face, PaLM, and more.
- [SageMaker](../../../examples/llm/sagemaker_endpoint_llm.ipynb)
- [Solar](../../../examples/llm/solar.ipynb)
- [Together.ai](../../../examples/llm/together.ipynb)
- [Unify AI](../../../examples/llm/unify.ipynb)
- [Vertex](../../../examples/llm/vertex.ipynb)
- [vLLM](../../../examples/llm/vllm.ipynb)
- [Xorbits Inference](../../../examples/llm/xinference_local_deployment.ipynb)