[BUG/Help] <title>Running api.py from the second video never finishes loading; error shown below #1461

Open
1 task done
zx0406 opened this issue Feb 22, 2024 · 0 comments

zx0406 commented Feb 22, 2024

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

Explicitly passing a revision is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a revision is encouraged when loading a configuration with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a revision is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
No compiled kernel found.
Compiling kernels : C:\Users\User.cache\huggingface\modules\transformers_modules\models--THUDM--chatglm-6b-int4\quantization_kernels_parallel.c
Compiling gcc -O3 -fPIC -pthread -fopenmp -std=c99 C:\Users\User.cache\huggingface\modules\transformers_modules\models--THUDM--chatglm-6b-int4\quantization_kernels_parallel.c -shared -o C:\Users\User.cache\huggingface\modules\transformers_modules\models--THUDM--chatglm-6b-int4\quantization_kernels_parallel.so
Load parallel cpu kernel failed, using default cpu kernel code:
Traceback (most recent call last):
  File "C:\Users\User/.cache\huggingface\modules\transformers_modules\models--THUDM--chatglm-6b-int4\quantization.py", line 156, in __init__
    kernels = ctypes.cdll.LoadLibrary(kernel_file)
  File "D:\ProgramData\Anaconda3\envs\yolov5\lib\ctypes\__init__.py", line 452, in LoadLibrary
    return self._dlltype(name)
  File "D:\ProgramData\Anaconda3\envs\yolov5\lib\ctypes\__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
FileNotFoundError: Could not find module 'C:\Users\User.cache\huggingface\modules\transformers_modules\models--THUDM--chatglm-6b-int4\quantization_kernels_parallel.so' (or one of its dependencies). Try using the full path with constructor syntax.

Compiling gcc -O3 -fPIC -std=c99 C:\Users\User.cache\huggingface\modules\transformers_modules\models--THUDM--chatglm-6b-int4\quantization_kernels.c -shared -o C:\Users\User.cache\huggingface\modules\transformers_modules\models--THUDM--chatglm-6b-int4\quantization_kernels.so
Load kernel : C:\Users\User.cache\huggingface\modules\transformers_modules\models--THUDM--chatglm-6b-int4\quantization_kernels.so
Using quantization cache
Applying quantization to glm layers
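
The log above shows the parallel kernel failing in `ctypes.cdll.LoadLibrary` with `FileNotFoundError ... (or one of its dependencies)`, after which the default kernel loads. On Windows that message can mean either that the `.so` file was never produced or that a DLL it depends on could not be resolved. The sketch below is one possible way to tell the two apart; it is not part of ChatGLM. The helper name `diagnose_kernel` is hypothetical, and the idea that the compiler's `bin` directory holds the missing runtime DLL is an assumption that may not match every gcc distribution.

```python
import ctypes
import os
import shutil


def diagnose_kernel(kernel_file, compiler="gcc"):
    """Collect facts that help explain a ctypes load failure (hypothetical helper)."""
    report = {
        "file_exists": os.path.isfile(kernel_file),
        "compiler_path": shutil.which(compiler),  # None if gcc is not on PATH
        "loaded": False,
        "error": None,
    }
    if not report["file_exists"]:
        report["error"] = "kernel file does not exist; the compile step may have failed"
        return report

    # Since Python 3.8 on Windows, dependent DLLs are no longer searched on PATH.
    # Assumption: the compiler's bin directory contains the runtime DLLs the
    # kernel links against, so it is registered explicitly before loading.
    dll_dir = None
    if report["compiler_path"] and hasattr(os, "add_dll_directory"):
        bin_dir = os.path.dirname(os.path.abspath(report["compiler_path"]))
        dll_dir = os.add_dll_directory(bin_dir)
    try:
        ctypes.cdll.LoadLibrary(kernel_file)
        report["loaded"] = True
    except OSError as exc:  # FileNotFoundError is a subclass of OSError
        report["error"] = str(exc)
    finally:
        if dll_dir is not None:
            dll_dir.close()
    return report
```

Calling `diagnose_kernel` with the full path of `quantization_kernels_parallel.so` from the log returns a dictionary. If `file_exists` is false, the problem is probably in the compile step; if the file exists but `loaded` is false, a missing dependency is the more likely cause.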

Expected Behavior

No response

Steps To Reproduce

Loading the INT-4 quantized model on Windows fails; the model never finishes loading.

Environment

- OS: Windows 10
- Python: 3.10
- Transformers: 4.27.1
- PyTorch: 2.01
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :
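
The CUDA field above is blank. As a convenience, the following is a minimal sketch of a script that prints the same environment fields in the template's format. The function names `package_version` and `environment_report` are made up for illustration; only standard-library calls and `torch.cuda.is_available()` (already named in the template) are used.

```python
import platform
from importlib import metadata


def package_version(name):
    """Return the installed version of a package, or a marker if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


def environment_report():
    """Build the Environment section of the issue template as text."""
    lines = [
        f"- OS: {platform.system()} {platform.release()}",
        f"- Python: {platform.python_version()}",
        f"- Transformers: {package_version('transformers')}",
        f"- PyTorch: {package_version('torch')}",
    ]
    try:
        import torch  # imported lazily so the script still runs without torch
        cuda = torch.cuda.is_available()
    except ImportError:
        cuda = "unknown (torch not importable)"
    lines.append(f"- CUDA Support: {cuda}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(environment_report())
```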

Anything else?

No response
