
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0! (when checking argument for argument index in method wrapper_CUDA__index_select) #89

Open
daiyizheng opened this issue Aug 11, 2023 · 2 comments

Comments

@daiyizheng

When I run inference with llama 65b on multiple A100 GPUs via infer.py, I get the error below; finetune.py runs without problems.

@daiyizheng (Author)

/slurm/home/yrd/shaolab/daiyizheng/.conda/envs/llama/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: /slurm/home/yrd/shaolab/daiyizheng/.conda/envs/llama did not contain libcudart.so as expected! Searching further paths...
warn(msg)
/slurm/home/yrd/shaolab/daiyizheng/.conda/envs/llama/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/run/user/80104/vscode-git-4b808d81bf.sock')}
warn(msg)
/slurm/home/yrd/shaolab/daiyizheng/.conda/envs/llama/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/run/user/80104/vscode-ipc-0b328df5-e364-494a-b230-9f7e99271b5b.sock')}
warn(msg)
/slurm/home/yrd/shaolab/daiyizheng/.conda/envs/llama/lib/python3.10/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('() { eval /usr/bin/modulecmd bash $*\n}')}
warn(msg)
The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization.
The tokenizer class you load from this checkpoint is 'LLaMATokenizer'.
The class this function is called from is 'LlamaTokenizer'.
The model weights are not tied. Please use the tie_weights method before using the infer_auto_device function.

Loading checkpoint shards:   0%|          | 0/81 [00:00<?, ?it/s]
...
Loading checkpoint shards: 100%|██████████| 81/81 [01:15<00:00,  1.07it/s]
Traceback (most recent call last):
  File "/slurm/home/yrd/shaolab/daiyizheng/nlp/Huatuo-Llama-Med-Chinese/infer.py", line 132, in <module>
    fire.Fire(main)
  File "/slurm/home/yrd/shaolab/daiyizheng/.conda/envs/llama/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/slurm/home/yrd/shaolab/daiyizheng/.conda/envs/llama/lib/python3.10/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/slurm/home/yrd/shaolab/daiyizheng/.conda/envs/llama/lib/python3.10/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/slurm/home/yrd/shaolab/daiyizheng/nlp/Huatuo-Llama-Med-Chinese/infer.py", line 118, in main
    infer_from_json(instruct_dir)
  File "/slurm/home/yrd/shaolab/daiyizheng/nlp/Huatuo-Llama-Med-Chinese/infer.py", line 105, in infer_from_json
    model_output = evaluate(instruction)
  File "/slurm/home/yrd/shaolab/daiyizheng/nlp/Huatuo-Llama-Med-Chinese/infer.py", line 87, in evaluate
    generation_output = model.generate(
  File "/slurm/home/yrd/shaolab/daiyizheng/.conda/envs/llama/lib/python3.10/site-packages/peft/peft_model.py", line 731, in generate
    outputs = self.base_model.generate(**kwargs)
  File "/slurm/home/yrd/shaolab/daiyizheng/.conda/envs/llama/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/slurm/home/yrd/shaolab/daiyizheng/.conda/envs/llama/lib/python3.10/site-packages/transformers/generation/utils.py", line 1611, in generate
    return self.beam_search(
  File "/slurm/home/yrd/shaolab/daiyizheng/.conda/envs/llama/lib/python3.10/site-packages/transformers/generation/utils.py", line 2982, in beam_search
    model_kwargs["past_key_values"] = self._reorder_cache(model_kwargs["past_key_values"], beam_idx)
  File "/slurm/home/yrd/shaolab/daiyizheng/.conda/envs/llama/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 762, in _reorder_cache
    reordered_past += (tuple(past_state.index_select(0, beam_idx) for past_state in layer_past),)
  File "/slurm/home/yrd/shaolab/daiyizheng/.conda/envs/llama/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 762, in <genexpr>
    reordered_past += (tuple(past_state.index_select(0, beam_idx) for past_state in layer_past),)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0! (when checking argument for argument index in method wrapper_CUDA__index_select)
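
For context on the failure: loading the 65B model sharded across GPUs (e.g. with device_map="auto") puts the beam-search past_key_values cache on both cuda:0 and cuda:1, while beam_idx lives on a single device, so the index_select inside _reorder_cache mixes devices. Later transformers releases fix exactly this by moving beam_idx onto each cached tensor's device before indexing. A minimal stopgap sketch, not the author's posted workaround, assuming a transformers version whose LlamaForCausalLM still carries the unpatched _reorder_cache shown in the traceback:

# Stopgap sketch (assumption: your installed transformers still ships the
# unpatched _reorder_cache from the traceback above). Mirrors the fix later
# merged upstream: move beam_idx onto whatever device each cache shard is on.
from transformers.models.llama.modeling_llama import LlamaForCausalLM

def _reorder_cache_multi_gpu(past_key_values, beam_idx):
    reordered_past = ()
    for layer_past in past_key_values:
        reordered_past += (
            tuple(
                # beam_idx follows the cache tensor's device (cuda:0 or cuda:1)
                past_state.index_select(0, beam_idx.to(past_state.device))
                for past_state in layer_past
            ),
        )
    return reordered_past

# _reorder_cache is a staticmethod on LlamaForCausalLM in this era of
# transformers; patch it on the class before calling model.generate(...).
LlamaForCausalLM._reorder_cache = staticmethod(_reorder_cache_multi_gpu)

Decoding with num_beams=1 (greedy or sampling) also avoids the crash, since _reorder_cache is only invoked during beam search.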

@daiyizheng (Author) commented Aug 11, 2023

My non-standard (hacky) workaround:
