⚠ Aborting and saving the final best model. Encountered exception: RuntimeError('Invalid argument') RuntimeError: Invalid argument #13468

Closed
Lance-Owen opened this issue Apr 28, 2024 · 1 comment
Labels
feat / ner Feature: Named Entity Recognizer feat / transformer Feature: Transformer gpu Using spaCy on GPU lang / zh Chinese language data and models training Training and updating models

Comments


Lance-Owen commented Apr 28, 2024

I ran into a problem while using the GPU provided by Kaggle to train my Chinese information extraction model. I used a config file generated with the config-generation tool (quickstart widget) on the spaCy website. Your help is greatly appreciated.
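
For reference, the config was created roughly along these lines (file names here are illustrative and the exact quickstart settings may differ): the quickstart widget on the spaCy site produces a base config, which is then filled in with the CLI.

# base_config.cfg downloaded from the quickstart widget (lang: zh, components: ner, hardware: GPU/transformer)
!python -m spacy init fill-config base_config.cfg configs/config.cfg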

Some of my environment information is below; if you need anything else, please leave a comment and I will do my best to provide it.

!nvidia-smi

| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla P100-PCIE-16GB           Off | 00000000:00:04.0 Off |                    0 |
| N/A   34C    P0              26W / 250W |      0MiB / 16384MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
!python -V
Python 3.10.13

!python -m spacy info

============================== Info about spaCy ==============================

spaCy version    3.7.4                         
Location         /opt/conda/lib/python3.10/site-packages/spacy
Platform         Linux-5.15.133+-x86_64-with-glibc2.31
Python version   3.10.13                       
Pipelines        zh_core_web_lg (3.7.0), en_core_web_sm (3.7.1), en_core_web_lg (3.7.1)

There are too many Python packages to list them all here; I can provide the full list if necessary.
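
If exact package versions are needed, the relevant ones (those that appear in the traceback below) can be pulled with something like this; the grep pattern is just a suggestion:

!pip list | grep -Ei "spacy|thinc|torch|cupy|transformers"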

The command that triggers the error:

!python -m spacy project run all

The error message in its entirety:

ℹ Running workflow 'all'

================================== convert ==================================
ℹ Skipping 'convert': nothing changed

=================================== train ===================================
Running command: /opt/conda/bin/python -m spacy train configs/config.cfg --output training/bid/ --paths.train corpus/train.spacy --paths.dev corpus/dev.spacy --gpu-id 0
ℹ Saving to output directory: training/bid
ℹ Using GPU: 0

=========================== Initializing pipeline ===========================
[2024-04-28 08:10:01,857] [INFO] Set up nlp object from config
[2024-04-28 08:10:01,902] [INFO] Pipeline: ['transformer', 'ner']
[2024-04-28 08:10:01,909] [INFO] Created vocabulary
[2024-04-28 08:10:01,910] [INFO] Finished initializing nlp object
[2024-04-28 08:10:19,274] [INFO] Initialized pipeline components: ['transformer', 'ner']
✔ Initialized pipeline

============================= Training pipeline =============================
ℹ Pipeline: ['transformer', 'ner']
ℹ Initial learn rate: 0.0
E    #       LOSS TRANS...  LOSS NER  ENTS_F  ENTS_P  ENTS_R  SCORE 
---  ------  -------------  --------  ------  ------  ------  ------
⚠ Aborting and saving the final best model. Encountered exception:
RuntimeError('Invalid argument')
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/conda/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/opt/conda/lib/python3.10/site-packages/spacy/__main__.py", line 4, in <module>
    setup_cli()
  File "/opt/conda/lib/python3.10/site-packages/spacy/cli/_util.py", line 87, in setup_cli
    command(prog_name=COMMAND)
  File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/typer/core.py", line 783, in main
    return _main(
  File "/opt/conda/lib/python3.10/site-packages/typer/core.py", line 225, in _main
    rv = self.invoke(ctx)
  File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/typer/main.py", line 683, in wrapper
    return callback(**use_params)  # type: ignore
  File "/opt/conda/lib/python3.10/site-packages/spacy/cli/train.py", line 54, in train_cli
    train(config_path, output_path, use_gpu=use_gpu, overrides=overrides)
  File "/opt/conda/lib/python3.10/site-packages/spacy/cli/train.py", line 84, in train
    train_nlp(nlp, output_path, use_gpu=use_gpu, stdout=sys.stdout, stderr=sys.stderr)
  File "/opt/conda/lib/python3.10/site-packages/spacy/training/loop.py", line 135, in train
    raise e
  File "/opt/conda/lib/python3.10/site-packages/spacy/training/loop.py", line 118, in train
    for batch, info, is_best_checkpoint in training_step_iterator:
  File "/opt/conda/lib/python3.10/site-packages/spacy/training/loop.py", line 220, in train_while_improving
    nlp.update(
  File "/opt/conda/lib/python3.10/site-packages/spacy/language.py", line 1193, in update
    proc.update(examples, sgd=None, losses=losses, **component_cfg[name])  # type: ignore
  File "/opt/conda/lib/python3.10/site-packages/spacy_transformers/pipeline_component.py", line 294, in update
    trf_full, bp_trf_full = self.model.begin_update(docs)
  File "/opt/conda/lib/python3.10/site-packages/thinc/model.py", line 328, in begin_update
    return self._func(self, X, is_train=True)
  File "/opt/conda/lib/python3.10/site-packages/spacy_transformers/layers/transformer_model.py", line 199, in forward
    model_output, bp_tensors = transformer(wordpieces, is_train)
  File "/opt/conda/lib/python3.10/site-packages/thinc/model.py", line 310, in __call__
    return self._func(self, X, is_train=is_train)
  File "/opt/conda/lib/python3.10/site-packages/thinc/layers/pytorchwrapper.py", line 225, in forward
    Ytorch, torch_backprop = model.shims[0](Xtorch, is_train)
  File "/opt/conda/lib/python3.10/site-packages/thinc/shims/pytorch.py", line 95, in __call__
    return self.begin_update(inputs)
  File "/opt/conda/lib/python3.10/site-packages/thinc/shims/pytorch.py", line 129, in begin_update
    output = self._model(*inputs.args, **inputs.kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/transformers/models/bert/modeling_bert.py", line 1013, in forward
    encoder_outputs = self.encoder(
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/transformers/models/bert/modeling_bert.py", line 607, in forward
    layer_outputs = layer_module(
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/transformers/models/bert/modeling_bert.py", line 497, in forward
    self_attention_outputs = self.attention(
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/transformers/models/bert/modeling_bert.py", line 427, in forward
    self_outputs = self.self(
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/transformers/models/bert/modeling_bert.py", line 325, in forward
    attention_scores = torch.matmul(query_layer, key_layer.transpose(-1, -2))
RuntimeError: Invalid argument
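
The exception is raised from the torch.matmul call inside BERT's self-attention. As a quick sanity check, independent of spaCy, something like the snippet below could be run in the same Kaggle environment to see whether a bare matmul on this GPU fails the same way (a suggested check, not part of the original run):

import torch

# Minimal check: does a plain matmul work on this GPU at all?
# If this also raises "RuntimeError: Invalid argument", the problem is in the
# CUDA / PyTorch stack rather than in spaCy or spacy-transformers.
a = torch.randn(8, 128, device="cuda")
b = torch.randn(128, 8, device="cuda")
print(torch.matmul(a, b).shape)  # expected: torch.Size([8, 8])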
@svlandeg (Member) commented:

Hi! Let me transfer this thread to the discussion forum, as we like to keep the issue tracker focused on bug reports.

@svlandeg svlandeg added lang / zh Chinese language data and models training Training and updating models gpu Using spaCy on GPU feat / ner Feature: Named Entity Recognizer feat / transformer Feature: Transformer labels May 15, 2024
@explosion explosion locked and limited conversation to collaborators May 15, 2024
@svlandeg svlandeg converted this issue into discussion #13495 May 15, 2024

This issue was moved to discussion #13495; the conversation continues there.
