
Commit 4fb2e21: merge dev_allinone

imClumsyPanda committed Aug 17, 2023
2 parents a97cf02 + d9f74ec

Showing 12 changed files with 651 additions and 117 deletions.
2 changes: 1 addition & 1 deletion .gitignore
@@ -4,4 +4,4 @@ logs
.idea/
__pycache__/
knowledge_base/
configs/model_config.py
configs/*.py
91 changes: 23 additions & 68 deletions README.md
@@ -181,9 +181,11 @@ $ git clone https://huggingface.co/moka-ai/m3e-base

### 3. Configure Settings

Copy the file [configs/model_config.py.example](configs/model_config.py.example) into the project's `./configs` directory and rename it to `model_config.py`.
Copy the model parameter configuration template [configs/model_config.py.example](configs/model_config.py.example) into the project's `./configs` directory and rename it to `model_config.py`.

Before launching the Web UI or command-line interaction, first check whether the model parameters in `configs/model_config.py` meet your needs:
Copy the server parameter configuration template [configs/server_config.py.example](configs/server_config.py.example) into the project's `./configs` directory and rename it to `server_config.py`.

Before launching the Web UI or command-line interaction, first check whether the parameters in `configs/model_config.py` and `configs/server_config.py` meet your needs:

- Make sure the local storage path of each LLM model downloaded to this machine is written in the `local_model_path` attribute of the corresponding entry in `llm_model_dict`, e.g.:

@@ -214,7 +216,6 @@ embedding_model_dict = {
```shell
$ python init_database.py
```

- If you are running this project for the first time and the knowledge base has not yet been created, or if the knowledge base type or embedding model in the configuration file has changed, run the following command to initialize or rebuild the knowledge base:

```shell
$ python init_database.py --recreate-vs
```

@@ -244,6 +245,7 @@ $ python server/llm_api.py

The project supports multi-GPU loading; modify the following three parameters of the `create_model_worker_app` function in llm_api.py:

```python
gpus=None,
num_gpus=1,
max_gpu_memory="20GiB"
```

##### 5.1.2 Launch the LLM service via the command-line script llm_api_launch.py

⚠️ **Note:**

**1. The llm_api_launch.py script natively supports Linux only; macOS devices need the corresponding Linux commands installed, and Windows users should use WSL;**

@@ -275,11 +277,13 @@ $ python server/llm_api_launch.py
```shell
$ python server/llm_api_launch.py --model-path-address model1@host1@port1 model2@host2@port2
```

If the server port is already occupied, specify the server port manually and update `base_api_url` of the corresponding model in model_config.py to the same port:

```shell
$ python server/llm_api_launch.py --server-port 8887
```
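The port change has to be mirrored in `llm_model_dict`; as a minimal sketch (the dict shape follows model_config.py.example elsewhere in this commit, and 8887 is just the example port above):

```python
# Sketch: keep a model's api_base_url in step with a custom --server-port.
server_port = 8887  # illustrative value matching the example command above

llm_model_dict = {
    "chatglm2-6b": {
        "local_model_path": "THUDM/chatglm2-6b",
        # must point at the port the server was actually started on
        "api_base_url": f"http://localhost:{server_port}/v1",
        "api_key": "EMPTY",
    },
}

print(llm_model_dict["chatglm2-6b"]["api_base_url"])  # http://localhost:8887/v1
```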

To launch with multi-GPU loading, an example command is:

```shell
$ python server/llm_api_launch.py --model-path-address model@host@port --num-gpus 2 --gpus 0,1 --max-gpu-memory 10GiB
```

@@ -354,7 +358,6 @@ $ streamlit run webui.py --server.port 666
- Web UI conversation page:

![](img/webui_0813_0.png)

- Web UI knowledge base management page:

![](img/webui_0813_1.png)
@@ -363,86 +366,38 @@ $ streamlit run webui.py --server.port 666

### 6. One-Click Launch

⚠️ **Note:**

**1. The one-click launch scripts natively support Linux only; macOS devices need the corresponding Linux commands installed, and Windows users should use WSL;**

**2. Loading a non-default model requires specifying it with the command-line option `--model-path-address`; the `model_config.py` settings are not read.**

#### 6.1 One-click API service launch script

A one-click API launch script has been added that starts the FastChat backend services and this project's API service in a single step. Example calls:

Call the default model:

```shell
$ python server/api_allinone.py
```

Load multiple non-default models:

```shell
$ python server/api_allinone.py --model-path-address model1@host1@port1 model2@host2@port2
```

If the server port is already occupied, specify the server port manually and update `base_api_url` of the corresponding model in model_config.py to the same port:

```shell
$ python server/api_allinone.py --server-port 8887
```

Multi-GPU launch:

```shell
python server/api_allinone.py --model-path-address model@host@port --num-gpus 2 --gpus 0,1 --max-gpu-memory 10GiB
```
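These flags correspond to the `gpus`, `num_gpus`, and `max_gpu_memory` parameters named in section 5.1.1; as a rough sketch with illustrative values (not a definitive configuration):

```python
# Illustrative multi-GPU settings only -- adjust to your hardware.
# The flag names mirror the three create_model_worker_app parameters.
multi_gpu_args = {
    "gpus": "0,1",              # comma-separated GPU ids visible to the worker
    "num_gpus": 2,              # number of GPUs to shard the model across
    "max_gpu_memory": "10GiB",  # memory cap per GPU
}

# Sanity check: the list of GPU ids should agree with num_gpus.
assert len(multi_gpu_args["gpus"].split(",")) == multi_gpu_args["num_gpus"]
```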

For other parameters, see the individual scripts and the FastChat service documentation.

#### 6.2 One-click WebUI launch script

Load a local model:
The updated one-click launch script startup.py starts all FastChat services, the API service, and the WebUI service in one step. Example:

```shell
$ python webui_allinone.py
$ python startup.py --all-webui
```

Call a remote API service:
All running services can be shut down directly with `Ctrl + C`.

```shell
$ python webui_allinone.py --use-remote-api
```
If the server port is already occupied, specify the server port manually and update `base_api_url` of the corresponding model in model_config.py to the same port:
Available options include `--all-webui`, `--all-api`, `--llm-api`, `--controller`, `--openai-api`,
`--model-worker`, `--api`, `--webui`, where:

```shell
$ python webui_allinone.py --server-port 8887
```
- `--all-webui` starts all services the WebUI depends on;

Run the webui service in the background:
- `--all-api` starts all services the API depends on;

```shell
$ python webui_allinone.py --nohup
```
- `--llm-api` starts all LLM services that FastChat depends on;

Load multiple non-default models:
- `--openai-api` starts only the FastChat controller and openai-api-server services;

```shell
$ python webui_allinone.py --model-path-address model1@host1@port1 model2@host2@port2
```
- the remaining options start individual services.

Multi-GPU launch:
To specify a non-default model, use the `--model-name` option, for example:

```shell
$ python webui_allinone.py --model-path-address model@host@port --num-gpus 2 --gpus 0,1 --max-gpu-memory 10GiB
$ python startup.py --all-webui --model-name Qwen-7B-Chat
```

For other parameters, see the individual scripts and the FastChat service documentation.
**Note:**

The two one-click launch scripts above run multiple services in the background; to stop them all, use the `shutdown_all.sh` script:
**1. The startup script launches each module's service in a separate process, which may scramble the printed output; wait until all services have started before making calls, and use the default or manually specified ports (default LLM API service port: `127.0.0.1:8888`, default API service port: `127.0.0.1:7861`, default WebUI service port: `<local IP>:8501`).**

```shell
bash shutdown_all.sh
```
**2. Service startup time varies by device, roughly 3-10 minutes; if the services have not come up after a long wait, check the logs under the `./logs` directory to locate the problem.**
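Since the services come up asynchronously, a small polling helper can tell when a port is ready before you start calling it. This is a generic sketch, not part of the project; the host and port are the defaults listed above:

```python
import socket
import time

def wait_for_port(host: str, port: int, timeout: float = 600.0) -> bool:
    """Poll until a TCP port accepts connections; return False if the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=2):
                return True
        except OSError:
            time.sleep(1)  # service not up yet; retry
    return False

# e.g. wait_for_port("127.0.0.1", 7861)  # default API service port
```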

## FAQ

5 changes: 4 additions & 1 deletion configs/__init__.py
@@ -1 +1,4 @@
from .model_config import *
from .model_config import *
from .server_config import *

VERSION = "v0.2.1-preview"
7 changes: 2 additions & 5 deletions configs/model_config.py.example
@@ -1,14 +1,11 @@
import os
import logging
import torch
import argparse
import json
# Log format
LOG_FORMAT = "%(asctime)s - %(filename)s[line:%(lineno)d] - %(levelname)s: %(message)s"
logger = logging.getLogger()
logger.setLevel(logging.INFO)
logging.basicConfig(format=LOG_FORMAT)
import json


# Modify the attribute values in the dict below to specify the local embedding model storage path
@@ -52,13 +49,13 @@ llm_model_dict = {

"chatglm2-6b": {
"local_model_path": "THUDM/chatglm2-6b",
"api_base_url": "http://localhost:8888/v1",  # "name" should be changed to the "api_base_url" of the fastchat service
"api_base_url": "http://localhost:8888/v1",  # the URL must match server_config.FSCHAT_OPENAI_API of the running fastchat service
"api_key": "EMPTY"
},

"chatglm2-6b-32k": {
"local_model_path": "THUDM/chatglm2-6b-32k", # "THUDM/chatglm2-6b-32k",
"api_base_url": "http://localhost:8888/v1",  # "name" should be changed to the "api_base_url" of the fastchat service
"api_base_url": "http://localhost:8888/v1",  # the URL must match server_config.FSCHAT_OPENAI_API of the running fastchat service
"api_key": "EMPTY"
},

100 changes: 100 additions & 0 deletions configs/server_config.py.example
@@ -0,0 +1,100 @@
from .model_config import LLM_MODEL, LLM_DEVICE

# Whether the API allows cross-origin requests; defaults to False, set to True to enable
# is open cross domain
OPEN_CROSS_DOMAIN = False

# Default bind host for each server
DEFAULT_BIND_HOST = "127.0.0.1"

# webui.py server
WEBUI_SERVER = {
"host": DEFAULT_BIND_HOST,
"port": 8501,
}

# api.py server
API_SERVER = {
"host": DEFAULT_BIND_HOST,
"port": 7861,
}

# fastchat openai_api server
FSCHAT_OPENAI_API = {
"host": DEFAULT_BIND_HOST,
"port": 8888, # the api_base_url of each model configured in model_config.llm_model_dict must match this.
}

# fastchat model_worker server
# These models must be correctly configured in model_config.llm_model_dict.
# When launching startup.py, the model can be specified with `--model-worker --model-name xxxx`; if unspecified, LLM_MODEL is used
FSCHAT_MODEL_WORKERS = {
LLM_MODEL: {
"host": DEFAULT_BIND_HOST,
"port": 20002,
"device": LLM_DEVICE,
# TODO: parameters that need to be configured for multi-GPU loading
"gpus": None,
"numgpus": 1,
# The following are less common parameters; configure them as needed
# "max_gpu_memory": "20GiB",
# "load_8bit": False,
# "cpu_offloading": None,
# "gptq_ckpt": None,
# "gptq_wbits": 16,
# "gptq_groupsize": -1,
# "gptq_act_order": False,
# "awq_ckpt": None,
# "awq_wbits": 16,
# "awq_groupsize": -1,
# "model_names": [LLM_MODEL],
# "conv_template": None,
# "limit_worker_concurrency": 5,
# "stream_interval": 2,
# "no_register": False,
},
}

# fastchat multi model worker server
FSCHAT_MULTI_MODEL_WORKERS = {
# todo
}

# fastchat controller server
FSCHAT_CONTROLLER = {
"host": DEFAULT_BIND_HOST,
"port": 20001,
"dispatch_method": "shortest_queue",
}


# Do not change anything below
def fschat_controller_address() -> str:
host = FSCHAT_CONTROLLER["host"]
port = FSCHAT_CONTROLLER["port"]
return f"http://{host}:{port}"


def fschat_model_worker_address(model_name: str = LLM_MODEL) -> str:
if model := FSCHAT_MODEL_WORKERS.get(model_name):
host = model["host"]
port = model["port"]
return f"http://{host}:{port}"


def fschat_openai_api_address() -> str:
host = FSCHAT_OPENAI_API["host"]
port = FSCHAT_OPENAI_API["port"]
return f"http://{host}:{port}"


def api_address() -> str:
host = API_SERVER["host"]
port = API_SERVER["port"]
return f"http://{host}:{port}"


def webui_address() -> str:
host = WEBUI_SERVER["host"]
port = WEBUI_SERVER["port"]
return f"http://{host}:{port}"
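The helper functions above all compose the same `http://{host}:{port}` base URL from their config dicts; a self-contained sketch of the pattern (values copied from the defaults above, reproduced locally for illustration):

```python
# Mirrors the address helpers in server_config.py.example: every service
# address is f"http://{host}:{port}" built from its config dict.
FSCHAT_OPENAI_API = {"host": "127.0.0.1", "port": 8888}

def fschat_openai_api_address() -> str:
    host = FSCHAT_OPENAI_API["host"]
    port = FSCHAT_OPENAI_API["port"]
    return f"http://{host}:{port}"

print(fschat_openai_api_address())  # http://127.0.0.1:8888
```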
36 changes: 20 additions & 16 deletions server/api.py
@@ -4,7 +4,9 @@

sys.path.append(os.path.dirname(os.path.dirname(__file__)))

from configs.model_config import NLTK_DATA_PATH, OPEN_CROSS_DOMAIN
from configs.model_config import NLTK_DATA_PATH
from configs.server_config import OPEN_CROSS_DOMAIN
from configs import VERSION
import argparse
import uvicorn
from fastapi.middleware.cors import CORSMiddleware
@@ -14,11 +16,10 @@
from server.knowledge_base.kb_api import list_kbs, create_kb, delete_kb
from server.knowledge_base.kb_doc_api import (list_docs, upload_doc, delete_doc,
update_doc, download_doc, recreate_vector_store,
search_docs, DocumentWithScore)
search_docs, DocumentWithScore)
from server.utils import BaseResponse, ListResponse, FastAPI, MakeFastAPIOffline
from typing import List


nltk.data.path = [NLTK_DATA_PATH] + nltk.data.path


@@ -27,7 +28,10 @@ async def document():


def create_app():
app = FastAPI(title="Langchain-Chatchat API Server")
app = FastAPI(
title="Langchain-Chatchat API Server",
version=VERSION
)
MakeFastAPIOffline(app)
# Add CORS middleware to allow all origins
# Set OPEN_DOMAIN=True in config.py to allow cross-origin requests
@@ -75,10 +79,10 @@ def create_app():
)(create_kb)

app.post("/knowledge_base/delete_knowledge_base",
tags=["Knowledge Base Management"],
response_model=BaseResponse,
summary="删除知识库"
)(delete_kb)
tags=["Knowledge Base Management"],
response_model=BaseResponse,
summary="删除知识库"
)(delete_kb)

app.get("/knowledge_base/list_docs",
tags=["Knowledge Base Management"],
@@ -87,10 +91,10 @@ def create_app():
)(list_docs)

app.post("/knowledge_base/search_docs",
tags=["Knowledge Base Management"],
response_model=List[DocumentWithScore],
summary="搜索知识库"
)(search_docs)
tags=["Knowledge Base Management"],
response_model=List[DocumentWithScore],
summary="搜索知识库"
)(search_docs)

app.post("/knowledge_base/upload_doc",
tags=["Knowledge Base Management"],
@@ -99,10 +103,10 @@ def create_app():
)(upload_doc)

app.post("/knowledge_base/delete_doc",
tags=["Knowledge Base Management"],
response_model=BaseResponse,
summary="删除知识库内指定文件"
)(delete_doc)
tags=["Knowledge Base Management"],
response_model=BaseResponse,
summary="删除知识库内指定文件"
)(delete_doc)

app.post("/knowledge_base/update_doc",
tags=["Knowledge Base Management"],
2 changes: 1 addition & 1 deletion server/api_allinone.py → server/api_allinone_stale.py
@@ -15,7 +15,7 @@
sys.path.append(os.path.dirname(__file__))
sys.path.append(os.path.dirname(os.path.dirname(__file__)))

from llm_api_launch import launch_all, parser, controller_args, worker_args, server_args
from llm_api_stale import launch_all, parser, controller_args, worker_args, server_args
from api import create_app
import uvicorn

File renamed without changes.
