
OpenAI-compatible API #20

Open · apcameron opened this issue Jul 18, 2023 · 22 comments

@apcameron

Is it possible to provide an API that mimics the functionality of the OpenAI API?

@Vectorrent

With LLaMA 2 released, it even expects the whole "system, user, assistant" format now...

https://github.com/facebookresearch/llama/blob/6c7fe276574e78057f917549435a2554000a876d/llama/generation.py#L213

@borzunov
Member

Hi @LuciferianInk,

The format is not obligatory but does improve the quality of the model. We'll try moving to the official format to achieve that.

@Eclipse-Station

I support that. Quite a few tools use the OpenAI library, and even non-OpenAI backends (like Oobabooga's text-generation-webui) offer an OpenAI API extension. This would make transitioning from OpenAI or other compatible third-party backends a breeze.

@borzunov
Member

borzunov commented Aug 7, 2023

@apcameron @Eclipse-Station I agree that this feature would be useful. We'll try to find time to implement it - and pull requests are always welcome!

@krrishdholakia

Hey @borzunov @Eclipse-Station, I'm confused - why do you need to mimic the OpenAI I/O format in a local model?

I don't think I saw this repo using OpenAI at all, so what's the advantage?

@borzunov
Member

borzunov commented Aug 15, 2023

Hi @krrishdholakia,

This repo doesn't use OpenAI API in any sense, but using a similar interface would help with interoperability with existing software.

E.g., one could take an existing chatbot/text generation UI supporting OpenAI API, then replace the API URL to make it work via the Petals swarm without any code changes.
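To illustrate the idea - a minimal sketch, assuming a hypothetical OpenAI-compatible Petals endpoint (the URL below is a placeholder; no such endpoint exists at the time of this thread):

import openai

# The application keeps its existing OpenAI calls; only the base URL changes.
openai.api_base = "https://petals.example.com/v1"  # placeholder, not a real endpoint

response = openai.ChatCompletion.create(
    model="petals-team/StableBeluga2",
    messages=[{"role": "user", "content": "Hello!"}],
)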

@krrishdholakia

krrishdholakia commented Aug 15, 2023

Oh! I think we might be able to help - https://github.com/BerriAI/litellm.

Just created an issue to track this. Hope to get it done today.

def translate_function(model, messages, max_tokens, **kwargs):
    # Join the chat messages into a single prompt string and rename
    # max_tokens to the max_new_tokens field that Petals expects.
    prompt = " ".join(message["content"] for message in messages)
    return {"model": model, "prompt": prompt, "max_new_tokens": max_tokens}

openai.api_base = litellm.translate_api_call(custom_api_base, translate_function)

@ishaan-jaff

@apcameron @borzunov We added the ability to call petals.dev using LiteLLM with OpenAI/ChatGPT-style input/output. Check out this example notebook:
https://github.com/BerriAI/litellm/blob/main/cookbook/liteLLM_Petals.ipynb
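For reference, a minimal sketch of the kind of call the notebook demonstrates, using LiteLLM's standard completion interface (the model name is the one used later in this thread; exact arguments may differ from the notebook):

import litellm

# The "petals/" prefix routes the request to LiteLLM's Petals provider,
# while the call itself uses the OpenAI-style messages format.
response = litellm.completion(
    model="petals/petals-team/StableBeluga2",
    messages=[{"role": "user", "content": "Write a haiku about distributed inference."}],
    max_tokens=100,
)
print(response)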

@borzunov
Member

borzunov commented Aug 16, 2023

@krrishdholakia @ishaan-jaff Thanks for making the integration!

I think @apcameron and @Eclipse-Station want an HTTP API in the OpenAI-compatible format (= one URL) that internally translates API calls to the Petals HTTP/WebSocket API or directly to the Petals swarm. Can litellm help with that?

In any case, we appreciate your work on making Petals available for litellm users!

@apcameron
Author

Yes, we do not want to have to change the code of the applications we are using. The OpenAI API we are looking for needs to be transparent to the caller.

@krrishdholakia

@apcameron are you just using OpenAI + Petals?

If you proxy OpenAI, then you also need to deal with all the other OpenAI requests (e.g., embeddings). But it seems like you just want to map the completion endpoint - correct?

In that case, wouldn't you want to basically remap openai.ChatCompletion.create?
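A rough sketch of that remapping idea - an illustrative monkey-patch built on LiteLLM, not an official API of either project (the model name is just the one used later in this thread):

import openai
import litellm

def _petals_chat_completion(*args, **kwargs):
    # Route openai.ChatCompletion.create calls through LiteLLM instead,
    # forcing a Petals-hosted model regardless of what the app requested.
    kwargs["model"] = "petals/petals-team/StableBeluga2"
    return litellm.completion(**kwargs)

# Existing application code calling openai.ChatCompletion.create(...)
# now transparently talks to Petals.
openai.ChatCompletion.create = _petals_chat_completion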

@apcameron
Author

apcameron commented Aug 16, 2023

Yes, I think the completion endpoint would be a good start, along with the option to select the model.

@borzunov changed the title from "openai API?" to "OpenAI-compatible API" Aug 25, 2023
@jontstaz

Any updates on this so far? It would be great to be able to use Petals as a drop-in replacement for anything using OpenAI's API.

@borzunov
Member

Hi @apcameron @Eclipse-Station @jontstaz,

Can you share a few examples of apps where an OpenAI-compatible API for Petals would be helpful? We hired a part-time dev who may work on this - it would be great to know some apps we can test it with.

@krrishdholakia, it seems like most people requesting this feature can't change the app code (e.g., to remap openai.ChatCompletion.create to LiteLLM). I'll double-check that once people share relevant examples of apps using OpenAI-compatible API endpoints.

@krrishdholakia

krrishdholakia commented Sep 27, 2023

Hey @borzunov @jontstaz @apcameron

we actually put out a solution for this - https://docs.litellm.ai/docs/proxy_server

It's a 1-click local proxy that spins up a local server to map OpenAI completion calls to any LiteLLM-supported API (Petals, Hugging Face TGI, TogetherAI, etc.).

Here's the CLI command:

litellm --model petals/petals-team/StableBeluga2

It'll spin up an OpenAI-compatible proxy server on port 8000.

Just set the OpenAI API base to this and it'll start making Petals calls:

openai.api_base = "http://localhost:8000"
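Expanding on that last line - a small self-contained example of how unchanged OpenAI-client code might use the proxy (this assumes the pre-1.0 openai Python SDK that was current at the time; the API key is an arbitrary placeholder since the local proxy doesn't validate it):

import openai

openai.api_base = "http://localhost:8000"  # the local LiteLLM proxy started above
openai.api_key = "not-needed"              # placeholder; the proxy ignores it

# The application code itself is unchanged - the proxy translates the
# request into a Petals call.
response = openai.ChatCompletion.create(
    model="petals/petals-team/StableBeluga2",
    messages=[{"role": "user", "content": "Hello from Petals!"}],
)
print(response)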

@krrishdholakia

@borzunov let me know if that covers the use case. If not, I'm happy to iterate and land on something that works for the community!

@apcameron
Author

Here are some examples of apps where it would be nice to have an OpenAI-compatible API pointing at Petals instead:
https://github.com/paul-gauthier/aider
https://github.com/OpenBMB/ChatDev
https://github.com/AntonOsika/gpt-engineer
https://github.com/microsoft/autogen

@apcameron
Author

(Quoting @krrishdholakia's LiteLLM proxy instructions above.)

Thanks, I will try this out in the next few days when I get some time.

@krrishdholakia

Added a tutorial for using the 1-click deploy with aider - https://docs.litellm.ai/docs/proxy_server#tutorial---using-with-aider

You can do this with Petals as well by running this instead of the Hugging Face command:

litellm --model petals/petals-team/StableBeluga2

@jontstaz

(Quoting @krrishdholakia's LiteLLM proxy instructions above.)

Perfect! This is exactly what I was looking for. Thanks.

Also, FYI, there's a new model that apparently performs better than CodeLlama and all other previous code-focused models: https://huggingface.co/TheBloke/Phind-CodeLlama-34B-v2-GPTQ

Would be cool to see it on Petals. I'm planning on getting a couple of 4090s relatively soon and then would be able to contribute to Petals with some code-focused models.

@softmix

softmix commented Mar 29, 2024

#50 and #51 would solve this in a proper way
