
[BUG/Help] When loading cpm_kernels, the path points to python.exe and raises [WinError 267] "The directory name is invalid" #1454

Open
SlimeMark opened this issue Feb 1, 2024 · 1 comment

@SlimeMark

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

When loading the model with quantization, it reports:
Failed to load cpm_kernels:[WinError 267] 目录名称无效。: 'C:\\Users\\Hengj\\AppData\\Local\\Programs\\Python\\Python310\\python.exe'

Full output when launching with Gradio:

Loading checkpoint shards:   0%|                                                                 | 0/7 [00:00<?, ?it/s]C:\Users\Hengj\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\_utils.py:831: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.  To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  return self.fget.__get__(instance, owner)()
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████| 7/7 [00:08<00:00,  1.17s/it]
Failed to load cpm_kernels:[WinError 267] 目录名称无效。: 'C:\\Users\\Hengj\\AppData\\Local\\Programs\\Python\\Python310\\python.exe'
Traceback (most recent call last):
  File "E:\ChatGLM3\basic_demo\web_demo_gradio.py", line 29, in <module>
    model = AutoModel.from_pretrained("E:\ChatGLM3", trust_remote_code=True, device_map="auto").quantize(4).cuda()
  File "C:\Users\Hengj\.cache\huggingface\modules\transformers_modules\ChatGLM3\modeling_chatglm.py", line 1208, in quantize
    self.transformer.encoder = quantize(self.transformer.encoder, bits, empty_init=empty_init, device=device,
  File "C:\Users\Hengj\.cache\huggingface\modules\transformers_modules\ChatGLM3\quantization.py", line 155, in quantize
    layer.self_attention.query_key_value = QuantizedLinear(
  File "C:\Users\Hengj\.cache\huggingface\modules\transformers_modules\ChatGLM3\quantization.py", line 139, in __init__
    self.weight = compress_int4_weight(self.weight)
  File "C:\Users\Hengj\.cache\huggingface\modules\transformers_modules\ChatGLM3\quantization.py", line 76, in compress_int4_weight
    blockDim = (min(round_up(m, 32), 1024), 1, 1)
NameError: name 'round_up' is not defined

Full output when launching with streamlit:

Loading checkpoint shards:   0%|                                                                 | 0/7 [00:00<?, ?it/s]C:\Users\Hengj\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\_utils.py:831: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.  To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  return self.fget.__get__(instance, owner)()
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████| 7/7 [00:04<00:00,  1.55it/s]
Failed to load cpm_kernels:[WinError 267] 目录名称无效。: 'C:\\Users\\Hengj\\AppData\\Local\\Programs\\Python\\Python310\\python.exe'
2024-02-02 00:41:57.327 Uncaught app exception
Traceback (most recent call last):
  File "C:\Users\Hengj\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\runtime\caching\cache_utils.py", line 264, in _get_or_create_cached_value
    cached_result = cache.read_result(value_key)
  File "C:\Users\Hengj\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\runtime\caching\cache_resource_api.py", line 498, in read_result
    raise CacheKeyNotFoundError()
streamlit.runtime.caching.cache_errors.CacheKeyNotFoundError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Hengj\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\runtime\caching\cache_utils.py", line 312, in _handle_cache_miss
    cached_result = cache.read_result(value_key)
  File "C:\Users\Hengj\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\runtime\caching\cache_resource_api.py", line 498, in read_result
    raise CacheKeyNotFoundError()
streamlit.runtime.caching.cache_errors.CacheKeyNotFoundError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Hengj\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 535, in _run_script
    exec(code, module.__dict__)
  File "E:\ChatGLM3\basic_demo\web_demo_streamlit.py", line 37, in <module>
    tokenizer, model = get_model()
  File "C:\Users\Hengj\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\runtime\caching\cache_utils.py", line 212, in wrapper
    return cached_func(*args, **kwargs)
  File "C:\Users\Hengj\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\runtime\caching\cache_utils.py", line 241, in __call__
    return self._get_or_create_cached_value(args, kwargs)
  File "C:\Users\Hengj\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\runtime\caching\cache_utils.py", line 267, in _get_or_create_cached_value
    return self._handle_cache_miss(cache, value_key, func_args, func_kwargs)
  File "C:\Users\Hengj\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\runtime\caching\cache_utils.py", line 321, in _handle_cache_miss
    computed_value = self._info.func(*func_args, **func_kwargs)
  File "E:\ChatGLM3\basic_demo\web_demo_streamlit.py", line 32, in get_model
    model = AutoModel.from_pretrained("E:\ChatGLM3", trust_remote_code=True, device_map="cuda").quantize(4).cuda()
  File "C:\Users\Hengj\.cache\huggingface\modules\transformers_modules\ChatGLM3\modeling_chatglm.py", line 1208, in quantize
    self.transformer.encoder = quantize(self.transformer.encoder, bits, empty_init=empty_init, device=device,
  File "C:\Users\Hengj\.cache\huggingface\modules\transformers_modules\ChatGLM3\quantization.py", line 155, in quantize
    layer.self_attention.query_key_value = QuantizedLinear(
  File "C:\Users\Hengj\.cache\huggingface\modules\transformers_modules\ChatGLM3\quantization.py", line 139, in __init__
    self.weight = compress_int4_weight(self.weight)
  File "C:\Users\Hengj\.cache\huggingface\modules\transformers_modules\ChatGLM3\quantization.py", line 76, in compress_int4_weight
    blockDim = (min(round_up(m, 32), 1024), 1, 1)
NameError: name 'round_up' is not defined

Expected Behavior

The directory path should not point to a specific file.
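For context (an observation about the error code, not something stated by cpm_kernels itself): [WinError 267] is what Windows raises when a directory operation, such as listing, is applied to a regular file, which matches the reported path ending in python.exe. A minimal, cross-platform sketch of that failure mode:

```python
import os
import tempfile

# Create a regular file, then try to list it as if it were a directory.
# On Windows this raises OSError with winerror 267 ("The directory name
# is invalid"); on POSIX the same condition raises NotADirectoryError
# (errno 20). Python maps both to the NotADirectoryError subclass.
with tempfile.NamedTemporaryFile(suffix=".exe", delete=False) as f:
    path = f.name

try:
    os.listdir(path)  # treating a file as a directory
    raised = False
except NotADirectoryError:
    raised = True
finally:
    os.remove(path)

print(raised)  # → True
```

So if cpm_kernels ever walks a search path whose entry is a file rather than a directory, this is exactly the error it would surface.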

Steps To Reproduce

1. Install all dependencies and download the model. Without quantization, everything runs normally.

2. Change line 29 of web_demo_gradio.py to
model = AutoModel.from_pretrained("E:\ChatGLM3", trust_remote_code=True, device_map="auto").quantize(4).cuda(), or
change line 32 of web_demo_streamlit.py to
model = AutoModel.from_pretrained("E:\ChatGLM3", trust_remote_code=True, device_map="cuda").quantize(4).cuda()

3. In the basic_demo directory, run python web_demo_gradio.py or streamlit run web_demo_streamlit.py

4. After "Loading checkpoint shards" completes, the console prints the errors described above.
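Both tracebacks end in the same NameError: quantization.py appears to import round_up from cpm_kernels, so when cpm_kernels fails to load, the name is never defined and the kernel-launch setup at quantization.py line 76 crashes. As a stopgap (an assumption about the helper's intended round-up-to-a-multiple semantics, not an official fix), round_up can be defined locally:

```python
def round_up(x: int, d: int) -> int:
    """Round x up to the nearest multiple of d.

    Assumed semantics of the helper normally provided by cpm_kernels.
    """
    return (x + d - 1) // d * d

# The expression from quantization.py line 76, using the fallback:
m = 100  # hypothetical row count for illustration
blockDim = (min(round_up(m, 32), 1024), 1, 1)
print(blockDim)  # → (128, 1, 1)
```

Note that this only removes the NameError; the underlying cpm_kernels load failure (the WinError 267 above) would still need to be resolved before the quantization kernels can actually run.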

Environment

- OS: Windows 11 23H2 (22631.3085)
- Python: 3.10.6
- Transformers: 4.37.1
- PyTorch: 2.1.1+cu118
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`): True

Anything else?

No response

@Taobiao03

I have the same problem
