bug: start chatglm-6b locally err #926

Open
zhangxinyang97 opened this issue Mar 5, 2024 · 0 comments

Describe the bug

I executed "TRUST_REMOTE_CODE=True openllm start /usr1/models/chatglm-6b"; the model loaded successfully and the Swagger UI was available, but a request to /v1/chat/completions failed with the following error:

Traceback (most recent call last):
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/starlette/middleware/errors.py", line 164, in __call__
    await self.app(scope, receive, _send)
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/starlette/routing.py", line 758, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/starlette/routing.py", line 778, in app
    await route.handle(scope, receive, send)
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/starlette/routing.py", line 299, in handle
    await self.app(scope, receive, send)
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/starlette/routing.py", line 79, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/starlette/routing.py", line 74, in app
    response = await func(request)
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/openllm/entrypoints/openai.py", line 159, in chat_completions
    prompt = llm.tokenizer.apply_chat_template(
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/openllm/_llm.py", line 411, in tokenizer
    self.__llm_tokenizer__ = openllm.serialisation.load_tokenizer(self, **self.llm_parameters[-1])
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/openllm/serialisation/__init__.py", line 33, in load_tokenizer
    tokenizer = transformers.AutoTokenizer.from_pretrained(bentomodel_fs.getsyspath('/'), trust_remote_code=llm.trust_remote_code, **tokenizer_attrs)
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/transformers/models/auto/tokenization_auto.py", line 810, in from_pretrained
    return tokenizer_class.from_pretrained(
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 2048, in from_pretrained
    return cls._from_pretrained(
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 2287, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/tokenization_chatglm.py", line 196, in __init__
    super().__init__(
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/transformers/tokenization_utils.py", line 367, in __init__
    self._add_tokens(
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/transformers/tokenization_utils.py", line 467, in _add_tokens
    current_vocab = self.get_vocab().copy()
  File "/root/.cache/huggingface/modules/transformers_modules/tokenization_chatglm.py", line 248, in get_vocab
    vocab = {self._convert_id_to_token(i): i for i in range(self.vocab_size)}
  File "/root/.cache/huggingface/modules/transformers_modules/tokenization_chatglm.py", line 244, in vocab_size
    return self.sp_tokenizer.num_tokens
AttributeError: 'ChatGLMTokenizer' object has no attribute 'sp_tokenizer'

It seems that the transformers version required by openllm and the transformers version required by chatglm-6b's bundled tokenizer code are not compatible.
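For reference, the failure should be reproducible with transformers alone, outside of OpenLLM (a minimal, untested sketch that mirrors the from_pretrained call shown in the traceback; the model path matches the one used above):

# Sketch: isolate the tokenizer load that fails inside OpenLLM.
# If the incompatibility is between transformers 4.38.2 and the model's
# bundled tokenization_chatglm.py, this should raise the same error
# without involving OpenLLM at all.
import transformers

tokenizer = transformers.AutoTokenizer.from_pretrained(
    '/usr1/models/chatglm-6b',
    trust_remote_code=True,
)
# Expected with transformers 4.38.2:
# AttributeError: 'ChatGLMTokenizer' object has no attribute 'sp_tokenizer'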

To reproduce

TRUST_REMOTE_CODE=True openllm start /usr1/models/chatglm-6b
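
Then call the chat completions endpoint, for example with a request like the following (a sketch only; the port 3000 comes from the server log below, and the model name in the payload is illustrative):

# Sketch: call the OpenAI-compatible chat completions route exposed by the server.
# Assumes the default port 3000 from the log below; the model name is illustrative.
import requests

resp = requests.post(
    'http://localhost:3000/v1/chat/completions',
    json={
        'model': 'chatglm-6b',
        'messages': [{'role': 'user', 'content': 'Hello'}],
    },
)
print(resp.status_code)  # returns 500, with the traceback shown above in the server log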

Logs

It is recommended to specify the backend explicitly. Cascading backend might lead to unexpected behaviour.
vLLM is not available. Note that PyTorch backend is not as performant as vLLM and you should always consider using vLLM for production.
🚀Tip: run 'openllm build /usr1/models/chatglm-6b --backend pt --serialization legacy' to create a BentoLLM for '/usr1/models/chatglm-6b'
2024-03-05T22:25:42+0800 [INFO] [cli] Prometheus metrics for HTTP BentoServer from "_service:svc" can be accessed at http://localhost:3000/metrics.
2024-03-05T22:25:43+0800 [INFO] [cli] Starting production HTTP BentoServer from "_service:svc" listening on http://0.0.0.0:3000 (Press CTRL+C to quit)
2024-03-05T22:25:49+0800 [WARNING] [runner:llm-chatglm-runner:1] OpenLLM failed to determine compatible Auto classes to load /usr1/models/chatglm-6b. Falling back to 'AutoModel'.
Tip: Make sure to specify 'AutoModelForCausalLM' or 'AutoModelForSeq2SeqLM' in your 'config.auto_map'. If your model type is yet to be supported, please file an issues on our GitHub tracker.
2024-03-05T22:25:51+0800 [INFO] [api_server:llm-chatglm-service:16] 10.143.178.153:49909 (scheme=http,method=POST,path=/v1/chat/completions,type=application/jsonl;charset=utf-8,length=620) (status=500,type=text/plain; charset=utf-8,length=3839) 58.546ms (trace=ccedf8ded80e4d3c5f325e5b67a3562a,span=5478b343dbfe4ef0,sampled=1,service.name=llm-chatglm-service)
2024-03-05T22:25:51+0800 [ERROR] [api_server:llm-chatglm-service:16] Exception in ASGI application
Traceback (most recent call last):
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/bentoml/_internal/server/http/traffic.py", line 26, in __call__
    await self.app(scope, receive, send)
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/bentoml/_internal/server/http/instruments.py", line 135, in __call__
    await self.app(scope, receive, wrapped_send)
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/opentelemetry/instrumentation/asgi/__init__.py", line 596, in __call__
    await self.app(scope, otel_receive, otel_send)
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/bentoml/_internal/server/http/access.py", line 126, in __call__
    await self.app(scope, receive, wrapped_send)
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/starlette/routing.py", line 758, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/starlette/routing.py", line 778, in app
    await route.handle(scope, receive, send)
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/starlette/routing.py", line 487, in handle
    await self.app(scope, receive, send)
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/starlette/applications.py", line 123, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/starlette/middleware/errors.py", line 186, in __call__
    raise exc
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/starlette/middleware/errors.py", line 164, in __call__
    await self.app(scope, receive, _send)
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/starlette/routing.py", line 758, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/starlette/routing.py", line 778, in app
    await route.handle(scope, receive, send)
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/starlette/routing.py", line 299, in handle
    await self.app(scope, receive, send)
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/starlette/routing.py", line 79, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/starlette/routing.py", line 74, in app
    response = await func(request)
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/openllm/entrypoints/openai.py", line 159, in chat_completions
    prompt = llm.tokenizer.apply_chat_template(
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/openllm/_llm.py", line 411, in tokenizer
    self.__llm_tokenizer__ = openllm.serialisation.load_tokenizer(self, **self.llm_parameters[-1])
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/openllm/serialisation/__init__.py", line 33, in load_tokenizer
    tokenizer = transformers.AutoTokenizer.from_pretrained(bentomodel_fs.getsyspath('/'), trust_remote_code=llm.trust_remote_code, **tokenizer_attrs)
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/transformers/models/auto/tokenization_auto.py", line 810, in from_pretrained
    return tokenizer_class.from_pretrained(
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 2048, in from_pretrained
    return cls._from_pretrained(
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 2287, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/tokenization_chatglm.py", line 196, in __init__
    super().__init__(
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/transformers/tokenization_utils.py", line 367, in __init__
    self._add_tokens(
  File "/opt/buildtools/python-3.9.2/lib/python3.9/site-packages/transformers/tokenization_utils.py", line 467, in _add_tokens
    current_vocab = self.get_vocab().copy()
  File "/root/.cache/huggingface/modules/transformers_modules/tokenization_chatglm.py", line 248, in get_vocab
    vocab = {self._convert_id_to_token(i): i for i in range(self.vocab_size)}
  File "/root/.cache/huggingface/modules/transformers_modules/tokenization_chatglm.py", line 244, in vocab_size
    return self.sp_tokenizer.num_tokens
AttributeError: 'ChatGLMTokenizer' object has no attribute 'sp_tokenizer'

Environment

Name: bentoml
Version: 1.1.11
Summary: BentoML: Build Production-Grade AI Applications
Home-page: None
Author: None
Author-email: BentoML Team [email protected]
License: Apache-2.0
Location: /opt/buildtools/python-3.9.2/lib/python3.9/site-packages
Requires: prometheus-client, python-json-logger, pathspec, fs, rich, numpy, packaging, opentelemetry-instrumentation, httpx, opentelemetry-util-http, click-option-group, psutil, opentelemetry-api, opentelemetry-instrumentation-aiohttp-client, opentelemetry-semantic-conventions, jinja2, schema, deepmerge, attrs, opentelemetry-sdk, python-dateutil, simple-di, cloudpickle, pip-requirements-parser, uvicorn, requests, aiohttp, watchfiles, circus, python-multipart, inflection, click, pip-tools, pyyaml, starlette, cattrs, nvidia-ml-py, opentelemetry-instrumentation-asgi
Required-by: openllm

Name: transformers
Version: 4.38.2
Summary: State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow
Home-page: https://github.com/huggingface/transformers
Author: The Hugging Face team (past and future) with the help of all our contributors (https://github.com/huggingface/transformers/graphs/contributors)
Author-email: [email protected]
License: Apache 2.0 License
Location: /opt/buildtools/python-3.9.2/lib/python3.9/site-packages
Requires: requests, numpy, regex, huggingface-hub, filelock, safetensors, packaging, tokenizers, pyyaml, tqdm
Required-by: optimum, openllm

Python 3.9.2

System information (Optional)

No response
