
CLI serving tool #702

Draft · rlouf wants to merge 2 commits into main
Conversation

@rlouf (Member) commented Feb 22, 2024

In this PR I introduce an outlines command-line interface that allows users to serve JSON-structured generation workflows locally. A workflow consists of a prompt template, an LLM and a Pydantic model. The API's parameters are the prompt template's arguments, and it returns a JSON object that respects the JSON Schema implicitly defined by the Pydantic model.

The use of the CLI is as follows:

outlines serve [SCRIPT] [--port 8000] [--name fn]
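
For illustration, a request against the served workflow might look like the sketch below; the endpoint path and payload shape are assumptions, since the HTTP API is not yet specified in this PR:

import requests

# The served function is assumed to be exposed at a path derived from --name;
# the prompt template's arguments are passed as the JSON body.
response = requests.post(
    "http://localhost:8000/fn",
    json={"arg": "some input text"},
)
print(response.json())  # a JSON object matching the Pydantic model's schema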

I am still not sure whether serving should happen via llama.cpp or vLLM. The interface to define the API using Outlines is also not completely defined:

from pydantic import BaseModel
import outlines


# Proposed interface: a Function bundles a model, a prompt template and a schema.
fn = outlines.Function("mistralai/Mistral-7B-v0.1")

@fn.prompt
def prompt_template(arg):
    """{arg}"""

@fn.schema
class Schema(BaseModel):
    foo: int
    bar: str

It should also be possible to call the function defined this way from another script.
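
For example, calling it from another script might look like the following; the module name is illustrative, and it assumes the decorated fn becomes directly callable with the template's arguments:

from my_workflow import fn  # hypothetical module where fn is defined

# Calling fn would render the prompt, generate with the model, and return
# output constrained to Schema's JSON Schema.
result = fn(arg="some input text")
print(result)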

@wang-haoxian

Hello Rémi,
Nice work!

I just checked llama.cpp's JSON schema feature:
https://github.com/abetlen/llama-cpp-python?tab=readme-ov-file#json-schema-mode
It seems that llama.cpp is working on its own way of structuring output.
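
For reference, a rough sketch of llama-cpp-python's JSON schema mode from the linked README; the model path is a placeholder and the exact arguments may differ between versions:

from llama_cpp import Llama

llm = Llama(model_path="./mistral-7b-v0.1.Q4_K_M.gguf", chat_format="chatml")
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Return foo and bar as JSON."}],
    # Constrain the output to a JSON object with the same fields as the
    # Pydantic Schema above (foo: int, bar: str).
    response_format={
        "type": "json_object",
        "schema": {
            "type": "object",
            "properties": {"foo": {"type": "integer"}, "bar": {"type": "string"}},
            "required": ["foo", "bar"],
        },
    },
)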

For the sake of efficiency, I am thinking of vLLM.
Here is an interesting benchmark that I found:
https://www.inferless.com/learn/exploring-llms-speed-benchmarks-independent-analysis---part-2

From the point of view of scalability, I think vLLM is a good choice because:

  • it's highly optimized for efficiency
  • we can leverage Triton to serve vLLM

I will be glad to contribute.
