
Chinese documentation: README.zh-cn.md

Whether because of network access restrictions or data security requirements, we may need to deploy large language models (LLMs) privately so that they can be accessed locally.

This project provides a quick way to build a private large language model server. With a single command, you can start the server locally and expose an OpenAI-compatible interface.

Note: This project can also run in a CPU-only environment, but inference will be slower.

How to use

1. Install dependencies

First, make sure Python is installed on your machine (I'm using 3.10).
Then install the dependencies:

pip install -r requirements.txt

2. Download the model

This project is based on FastChat, which supports multiple large language models.

I have only tested the LLM THUDM/chatglm3-6b and the embedding model BAAI/bge-large-zh. Other models should work in theory.

git lfs install
git clone https://huggingface.co/THUDM/chatglm3-6b
git clone https://huggingface.co/BAAI/bge-large-zh
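
If Git LFS is not active when a repository is cloned, the large weight files are left as small text pointer stubs instead of the real data. The following is a minimal sketch for spotting that situation before starting the server. The helper name `find_lfs_pointers` is hypothetical and not part of this project; it relies only on the fact that a Git LFS pointer file begins with a fixed version line.

```python
from pathlib import Path

# Every Git LFS pointer file starts with this line (Git LFS pointer spec v1).
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def find_lfs_pointers(model_dir):
    """Return files under model_dir that are still Git LFS pointer stubs."""
    root = Path(model_dir)
    stubs = []
    for path in sorted(root.rglob("*")):
        # Skip directories and anything inside the repository's .git folder.
        if not path.is_file() or ".git" in path.relative_to(root).parts:
            continue
        with path.open("rb") as fh:
            head = fh.read(len(LFS_POINTER_PREFIX))
        if head == LFS_POINTER_PREFIX:
            stubs.append(path)
    return stubs
```

An empty result for a call such as `find_lfs_pointers("chatglm3-6b")` suggests the weights were downloaded. If stubs are listed, running `git lfs pull` inside the cloned directory should fetch the real files.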

3. Configuration

This project can deploy multiple models at the same time. You only need to configure the model name and path as key-value pairs in config.py.

WORK_CONFIG = {
    "host": HOST,    
    "port": 21002,
    # Model name and path key-value pairs
    "models": {
        # The name can be customized, and the path can be relative or absolute
        "ChatModel":"d:/chatglm3-6b", 
        "EmbeddingsModel":"./models/bge-large-zh", 
    },    
}
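
A wrong path in the `models` mapping is easy to miss until the server fails to load a model. The sketch below checks such a mapping up front. The function `check_model_paths` is a hypothetical helper, not part of this project, and it assumes relative paths are meant relative to the directory passed as `base_dir` (by default the current working directory).

```python
from pathlib import Path


def check_model_paths(models, base_dir="."):
    """Map each model name to its resolved path; raise if a path is missing."""
    resolved = {}
    for name, raw_path in models.items():
        path = Path(raw_path)
        # Assumption: relative paths are interpreted relative to base_dir.
        if not path.is_absolute():
            path = Path(base_dir) / path
        if not path.is_dir():
            raise FileNotFoundError(f"model {name!r}: no directory at {path}")
        resolved[name] = path.resolve()
    return resolved
```

For example, `check_model_paths(WORK_CONFIG["models"])` would raise `FileNotFoundError` naming the first model whose directory does not exist.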

4. Start the service

python startup.py

When you see the following output, the service has been started successfully:

Local-LLM-Server is successfully started, please use http://127.0.0.1:21000 to access the OpenAI interface
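
Model loading can take some time, so a script that launches the server and then calls it may want to wait until the port accepts connections. This is a minimal sketch using only the standard library. The helper `wait_for_port` is hypothetical, and an open port only shows that something is listening, not that every model has finished loading.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, else False."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means a process is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

With the address from the startup message, the call would be `wait_for_port("127.0.0.1", 21000)`.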

Usage examples

The sample code is stored in the demos directory.

1. Python

import openai

openai.api_key = "Empty"
openai.base_url = "http://localhost:21000/v1/"

# Use the LLM model
completion = openai.chat.completions.create(
    model="ChatModel",
    messages=[{"role": "user", "content": "Tell us about yourself?"}]
)
print(completion.choices[0].message.content)

# Use the Embeddings model
embedding = openai.embeddings.create(
    model="EmbeddingsModel",
    input="Please star⭐️ this project on GitHub!",
    encoding_format="float",
)
print(embedding.data[0].embedding)
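
Because the interface is OpenAI-compatible, the same chat call can in principle be made without the `openai` package. The sketch below builds the equivalent HTTP request with the standard library. The helper names `build_chat_request` and `send` are hypothetical, the `/chat/completions` path follows the OpenAI API convention, and the sketch assumes the local server does not require an `Authorization` header.

```python
import json
import urllib.request

BASE_URL = "http://localhost:21000/v1"  # address from the startup message


def build_chat_request(model, user_message, base_url=BASE_URL):
    """Build (but do not send) a POST request for /chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=120):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the server running, `send(build_chat_request("ChatModel", "Tell us about yourself?"))` should return a dictionary whose `choices[0]["message"]["content"]` holds the reply, mirroring the object used in the example above.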

2. C#

Requires the NuGet package Microsoft.SemanticKernel 1.0.1.

using Microsoft.SemanticKernel;

var kernel = Kernel.CreateBuilder()
        .AddOpenAIChatCompletion(
             modelId: "ChatModel",
             apiKey: "NoKey",
             httpClient: new HttpClient(new MyHandler())
        ).Build();

var prompt = "Tell us about yourself?";
var result = await kernel.InvokePromptAsync(prompt);
var answer = result.GetValue<string>();
Console.WriteLine(answer);

// Microsoft.SemanticKernel does not provide a direct way to set the address of the OpenAI server,
// so a custom DelegatingHandler rewrites each request URI to point at the Local-LLM-Server address.
class MyHandler : DelegatingHandler
{
    public MyHandler()
        : base(new HttpClientHandler())
    {
    }
    protected override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
    {
        var newUriBuilder = new UriBuilder(request.RequestUri);
        newUriBuilder.Scheme = "http";
        newUriBuilder.Host = "127.0.0.1";
        newUriBuilder.Port = 21000;

        request.RequestUri = newUriBuilder.Uri;
        return base.SendAsync(request, cancellationToken);
    }
}
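
The handler above only changes the scheme, host, and port of each outgoing request and leaves the path untouched. The same rewrite can be illustrated in Python. The function `redirect_to_local` is a hypothetical name used only for this sketch, and the `api.openai.com` URL is assumed to be the address the client would otherwise call.

```python
from urllib.parse import urlsplit, urlunsplit


def redirect_to_local(url, host="127.0.0.1", port=21000):
    """Swap scheme, host, and port; keep path, query, and fragment."""
    parts = urlsplit(url)
    return urlunsplit(
        ("http", f"{host}:{port}", parts.path, parts.query, parts.fragment)
    )


print(redirect_to_local("https://api.openai.com/v1/chat/completions"))
# prints http://127.0.0.1:21000/v1/chat/completions
```

Since the `/v1/...` path is preserved, the request reaches the matching endpoint on the local server.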
