
[BUG/Help] RuntimeError: "bernoulli_scalar_cpu_" not implemented for 'Half' when reproducing p-tuning fine-tuning #1468

Open
ysqfirmament opened this issue Mar 23, 2024 · 3 comments


Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

While fine-tuning and trying to reproduce the ADGEN dataset task, this error occurs while running bash train.sh.

Running

import torch
print(torch.cuda.is_available())

returns True.

C:\Users\firmament\AppData\Roaming\Python\Python310\site-packages\transformers\optimization.py:391: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning
  warnings.warn(
input_ids [5, 65421, 61, 67329, 32, 98339, 61, 72043, 32, 65347, 61, 70872, 32, 69768, 61, 68944, 32, 67329, 64103, 61, 96914, 130001, 130004, 5, 87052, 96914, 81471, 64562, 65759, 64493, 64988, 6, 65840, 65388, 74531, 63825, 75786, 64009, 63823, 65626, 63882, 64619, 65388, 6, 64480, 65604, 85646, 110945, 10, 64089, 65966, 87052, 67329, 65544, 6, 71964, 70533, 64417, 63862, 89978, 63991, 63823, 77284, 88473, 64219, 63848, 112012, 6, 71231, 65099, 71252, 66800, 85768, 64566, 64338, 100323, 75469, 63823, 117317, 64218, 64257, 64051, 74197, 6, 63893, 130005, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3]
inputs [decoded input text omitted: garbled by the Windows console encoding]
label_ids [-100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 130004, 5, 87052, 96914, 81471, 64562, 65759, 64493, 64988, 6, 65840, 65388, 74531, 63825, 75786, 64009, 63823, 65626, 63882, 64619, 65388, 6, 64480, 65604, 85646, 110945, 10, 64089, 65966, 87052, 67329, 65544, 6, 71964, 70533, 64417, 63862, 89978, 63991, 63823, 77284, 88473, 64219, 63848, 112012, 6, 71231, 65099, 71252, 66800, 85768, 64566, 64338, 100323, 75469, 63823, 117317, 64218, 64257, 64051, 74197, 6, 63893, 130005, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100]
labels [leading <image_-100> placeholders for the masked positions] [decoded label text omitted: garbled by the Windows console encoding] [trailing <image_-100> placeholders]
  0%|          | 0/3000 [00:00<?, ?it/s]03/23/2024 23:23:53 - WARNING - transformers_modules.chatglm-6b-int4.modeling_chatglm - `use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...
Traceback (most recent call last):
  File "D:\GLM\ChatGLM-6B-main\ptuning\main.py", line 430, in <module>
    main()
  File "D:\GLM\ChatGLM-6B-main\ptuning\main.py", line 369, in main
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
  File "D:\GLM\ChatGLM-6B-main\ptuning\trainer.py", line 1635, in train
    return inner_training_loop(
  File "D:\GLM\ChatGLM-6B-main\ptuning\trainer.py", line 1904, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "D:\GLM\ChatGLM-6B-main\ptuning\trainer.py", line 2647, in training_step
    loss = self.compute_loss(model, inputs)
  File "D:\GLM\ChatGLM-6B-main\ptuning\trainer.py", line 2679, in compute_loss
    outputs = model(**inputs)
  File "C:\Users\firmament\AppData\Roaming\Python\Python310\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\firmament\AppData\Roaming\Python\Python310\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\firmament/.cache\huggingface\modules\transformers_modules\chatglm-6b-int4\modeling_chatglm.py", line 1190, in forward
    transformer_outputs = self.transformer(
  File "C:\Users\firmament\AppData\Roaming\Python\Python310\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\firmament\AppData\Roaming\Python\Python310\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\firmament/.cache\huggingface\modules\transformers_modules\chatglm-6b-int4\modeling_chatglm.py", line 930, in forward
    past_key_values = self.get_prompt(batch_size=input_ids.shape[0], device=input_ids.device,
  File "C:\Users\firmament/.cache\huggingface\modules\transformers_modules\chatglm-6b-int4\modeling_chatglm.py", line 878, in get_prompt
    past_key_values = self.dropout(past_key_values)
  File "C:\Users\firmament\AppData\Roaming\Python\Python310\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\Users\firmament\AppData\Roaming\Python\Python310\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\firmament\AppData\Roaming\Python\Python310\site-packages\torch\nn\modules\dropout.py", line 58, in forward
    return F.dropout(input, self.p, self.training, self.inplace)
  File "C:\Users\firmament\AppData\Roaming\Python\Python310\site-packages\torch\nn\functional.py", line 1266, in dropout
    return _VF.dropout_(input, p, training) if inplace else _VF.dropout(input, p, training)
RuntimeError: "bernoulli_scalar_cpu_" not implemented for 'Half'
  0%|          | 0/3000 [00:00<?, ?it/s]
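
For reference, the error appears to come from the CPU dropout path having no float16 (Half) kernel. A minimal standalone sketch (not from the repository) that hits the same failure on builds where that kernel is missing:

import torch
import torch.nn.functional as F

x = torch.randn(4, 4, dtype=torch.float16)  # a Half tensor on the CPU
# On builds whose CPU backend lacks a Half bernoulli/dropout kernel, this raises:
# RuntimeError: "bernoulli_scalar_cpu_" not implemented for 'Half'
y = F.dropout(x, p=0.1, training=True)

# The same call works once the tensor is float32 or lives on the GPU:
y = F.dropout(x.float(), p=0.1, training=True)
if torch.cuda.is_available():
    y = F.dropout(x.cuda(), p=0.1, training=True)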

Expected Behavior

No response

Steps To Reproduce

Put the ADGEN dataset folder into the ptuning folder.
Run bash train.sh in the ptuning folder.
The error appears.

Environment

- OS: Windows 11
- Python: 3.10
- Transformers: 4.27.1
- PyTorch: 2.2.1+cu121
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) : True

Anything else?

No response

@ysqfirmament (Author) commented:

Could it be that my machine just can't handle it?


Zylsjsp commented May 8, 2024

Could it be that my machine just can't handle it?

I think you should first say what GPU you have and how much VRAM it has, and also check whether your GPU supports model quantization (I remember there is a note about this in the README at the repo root).

The default configuration quantizes to int4, so the VRAM requirement is very low, and your error is not an OOM anyway, so running out of VRAM can probably be ruled out (at least it is not the cause at the point where this error is raised).

My suggestion is to change the quantization argument to fp16 (or simply remove it). Not quantizing the model only costs some extra VRAM, and it is much faster: first, loading skips the quantization step, and second, fp16 is the fastest for training and inference (in my tests, training time was fp16 << int4 < int8).
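
For the non-quantized route, a minimal loading sketch (following the pattern in the ChatGLM-6B README; in train.sh this corresponds to deleting the quantization argument, as suggested above):

# Sketch only, not the repository's training script: load the non-quantized
# checkpoint and keep the fp16 weights on the GPU, so no Half ops run on the CPU.
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = model.half().cuda()  # fp16 on the GPU: more VRAM than int4/int8, but faster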

By the way, my setup is 4× Tesla T4 cards (16 GB VRAM each); it can run every p-tuning configuration, but full fine-tuning runs out of VRAM. My software versions are:

- Python: 3.9.19
- Transformers: 4.27.1
- PyTorch: 1.13.1+cu116
- CUDA: 11.6

Because the server could not be upgraded, and another fine-tuning environment needed transformers>=4.30, I spent a long time working through dependency hell, so I remember the dependency versions very clearly.

If nothing else works, you could try matching my configuration exactly; don't overthink it, just get it running first.

Also, I run this on Linux, so you might try finding a server as well.


Zylsjsp commented May 8, 2024

Check whether the code you are using is up to date. This error is basically saying that a scalar operation has no half-precision implementation. If the latest code still raises the same error, you can try editing the failing code to remove the half() half-precision casting.
If you modify the code, the VRAM needed will probably go up, and the int quantization step may change along with it, so I don't recommend editing code unless you understand what it does, and I don't recommend doing int quantization after such modifications either. #462
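
As an illustration only (not a tested fix): a closely related workaround, applied at the line the traceback points to (past_key_values = self.dropout(past_key_values) in get_prompt of modeling_chatglm.py), keeps the fp16 weights but runs the failing dropout in float32 and casts back:

# Hypothetical sketch inside get_prompt(); assumes the surrounding file's
# imports (torch) and the existing self.dropout module.
if past_key_values.device.type == "cpu" and past_key_values.dtype == torch.float16:
    past_key_values = self.dropout(past_key_values.float()).half()
else:
    past_key_values = self.dropout(past_key_values)

The cleaner route is usually to make sure the whole model (including the prefix encoder) sits on the GPU before training, so the dropout never runs on a CPU Half tensor.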
