NameError: name 'index_first_axis' is not defined #746

Open

Stangerine opened this issue May 2, 2024 · 14 comments

@Stangerine

Can anyone help me? Thanks.

@staoxiao
Collaborator

staoxiao commented May 2, 2024

Can you share the command you used?

@Stangerine
Author

> Can you share the command you used?

torchrun --nproc_per_node 1 \
-m FlagEmbedding.llm_reranker.finetune_for_layerwise.run \
--output_dir /opt/data/private/zzq/models/bge-reranker-v2-minicpm-layerwise-finetuned \
--model_name_or_path /opt/data/private/zzq/models/bge-reranker-v2-minicpm-layerwise \
--train_data /opt/data/private/zzq/dataset/train_data/2.0/finetune_data_for_reranker_2.0.jsonl \
--learning_rate 2e-4 \
--num_train_epochs 1 \
--per_device_train_batch_size 1 \
--gradient_accumulation_steps 16 \
--dataloader_drop_last True \
--query_max_len 512 \
--passage_max_len 8192 \
--train_group_size 16 \
--logging_steps 1 \
--save_steps 2000 \
--save_total_limit 50 \
--ddp_find_unused_parameters False \
--gradient_checkpointing \
--deepspeed /opt/data/private/zzq/train/stage1.json \
--warmup_ratio 0.1 \
--bf16 \
--use_lora True \
--lora_rank 32 \
--lora_alpha 64 \
--use_flash_attn True \
--target_modules q_proj k_proj v_proj o_proj \
--start_layer 8 \
--head_multi True \
--head_type simple \
--lora_extra_parameters linear_head \
--finetune_type from_finetuned_model

@staoxiao
Collaborator

staoxiao commented May 2, 2024

@545999961, please take a look at this issue when convenient.

@Stangerine
Author

> @545999961, please take a look at this issue when convenient.

Thank you, my friend!

@545999961
Collaborator

Can you provide specific error information? I want to know where the error occurred.

@Stangerine
Author

> Can you provide specific error information? I want to know where the error occurred.

[screenshot of the traceback, ending in `NameError: name 'index_first_axis' is not defined`]

@545999961
Collaborator

Can you provide your versions of transformers and flash-attn?
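
For reference, a quick way to check the installed versions, assuming a standard pip environment where all three packages import:

```python
# Hypothetical helper, not part of FlagEmbedding: print the versions
# the maintainers are asking about.
import torch
import transformers
import flash_attn

print("torch:", torch.__version__)
print("transformers:", transformers.__version__)
print("flash-attn:", flash_attn.__version__)
```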

@Stangerine
Author

> Can you provide your versions of transformers and flash-attn?

Thank you, it has been solved. I changed the versions of flash-attn and torch.
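
For anyone who lands here with the same error: `index_first_axis` comes from `flash_attn.bert_padding`, and transformers only binds that name when this import succeeds. A minimal check, assuming the environment from this thread (reinstalling flash-attn against the matching torch/CUDA build is the usual fix, as the author did):

```python
# Minimal reproduction of the failing import: transformers only defines
# index_first_axis when this succeeds, so a NameError at call time usually
# means flash-attn is incompatible with the installed torch/CUDA build.
from flash_attn.bert_padding import index_first_axis, pad_input, unpad_input

print("flash-attn padding helpers import cleanly")
```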

@Stangerine
Author

> Can you provide your versions of transformers and flash-attn?

/root/anaconda3/envs/zzq_kdd/lib/python3.9/site-packages/transformers/tokenization_utils_base.py:2692: UserWarning: `max_length` is ignored when `padding=True` and there is no truncation strategy. To pad to max length, use `padding='max_length'`.

/root/anaconda3/envs/zzq_kdd/lib/python3.9/site-packages/torch/utils/checkpoint.py:460: UserWarning: torch.utils.checkpoint: please pass in use_reentrant=True or use_reentrant=False explicitly. The default value of use_reentrant will be updated to be False in the future. To maintain current behavior, pass use_reentrant=True. It is recommended that you use use_reentrant=False. Refer to docs for more details on the differences between the two variants.

Friends, do these two warnings have any impact on training the reranker?

@Stangerine
Author

Here is my environment (`pip list` output):

accelerate 0.29.1
addict 2.4.0
aiohttp 3.9.3
aiolimiter 1.1.0
aiosignal 1.3.1
aliyun-python-sdk-core 2.15.0
aliyun-python-sdk-kms 2.16.2
annotated-types 0.6.0
antlr4-python3-runtime 4.9.3
anyio 4.3.0
argon2-cffi 23.1.0
argon2-cffi-bindings 21.2.0
asgiref 3.8.1
async-timeout 4.0.3
attrs 23.2.0
auto_gptq 0.7.1
azure-core 1.30.1
azure-storage-blob 12.19.1
backoff 2.2.1
bcrypt 4.1.2
beautifulsoup4 4.12.3
blinker 1.7.0
blis 0.7.11
build 1.2.1
cachetools 5.3.3
catalogue 2.0.10
certifi 2024.2.2
cffi 1.16.0
charset-normalizer 3.3.2
chroma-hnswlib 0.7.3
chromadb 0.4.24
click 8.1.7
cloudpathlib 0.16.0
cohere 5.3.3
coloredlogs 15.0.1
confection 0.1.4
crcmod 1.7
cryptography 42.0.5
cycler 0.11.0
cymem 2.0.8
dataclasses-json 0.6.4
datasets 2.18.0
DBUtils 3.1.0
deepspeed 0.14.2
Deprecated 1.2.14
dill 0.3.8
dirtyjson 1.0.8
distro 1.9.0
docker 6.1.3
docker-compose 1.29.2
dockerpty 0.4.1
docopt 0.6.2
dpr 0.2.1
einops 0.7.0
en-core-web-sm 3.7.1
environs 9.5.0
exceptiongroup 1.2.0
faiss-gpu 1.7.2
fastapi 0.110.2
fastavro 1.9.4
filelock 3.12.2
FlagEmbedding 1.2.9
flash-attention 1.0.0
flash-attn 2.5.7
Flask 3.0.3
flatbuffers 24.3.25
fonttools 4.38.0
frozenlist 1.4.1
fsspec 2024.2.0
gast 0.5.4
gekko 1.1.1
google-auth 2.29.0
googleapis-common-protos 1.63.0
greenlet 3.0.3
grpcio 1.60.0
h11 0.14.0
hjson 3.1.0
httpcore 1.0.5
httptools 0.6.1
httpx 0.27.0
httpx-sse 0.4.0
huggingface-hub 0.22.2
humanfriendly 10.0
hybrid 1.2.3
idna 3.7
importlib-metadata 6.7.0
importlib-resources 5.12.0
isodate 0.6.1
itsdangerous 2.1.2
jieba 0.42.1
Jinja2 3.1.3
jmespath 0.10.0
joblib 1.3.2
JPype1 1.5.0
jsonlines 3.1.0
jsonpatch 1.33
jsonpointer 2.4
jsonschema 3.2.0
kiwisolver 1.4.5
konlpy 0.6.0
kubernetes 29.0.0
langchain 0.1.16
langchain-chroma 0.1.0
langchain-community 0.0.32
langchain-core 0.1.42
langchain-openai 0.1.3
langchain-text-splitters 0.0.1
langcodes 3.3.0
langsmith 0.1.46
llama-index 0.10.29
llama-index-agent-openai 0.2.2
llama-index-cli 0.1.11
llama-index-core 0.10.29
llama-index-embeddings-openai 0.1.7
llama-index-indices-managed-llama-cloud 0.1.5
llama-index-legacy 0.9.48
llama-index-llms-openai 0.1.15
llama-index-multi-modal-llms-openai 0.1.5
llama-index-postprocessor-flag-embedding-reranker 0.1.2
llama-index-program-openai 0.1.5
llama-index-question-gen-openai 0.1.3
llama-index-readers-file 0.1.16
llama-index-readers-llama-parse 0.1.4
llama-parse 0.4.0
llamaindex-py-client 0.1.18
LM-Cocktail 0.0.4
lxml 5.2.1
MarkupSafe 2.1.5
marshmallow 3.21.1
matplotlib 3.5.3
mecab-python3 1.0.8
milvus-model 0.2.0
minio 7.2.5
mmh3 4.1.0
modelscope 1.13.3
monotonic 1.6
mpmath 1.3.0
msgspec 0.18.6
multidict 6.0.5
multiprocess 0.70.16
murmurhash 1.0.10
mypy-extensions 1.0.0
nest-asyncio 1.6.0
networkx 3.2.1
ninja 1.11.1.1
nltk 3.8.1
numpy 1.26.4
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12 8.9.2.26
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu12 12.1.0.106
nvidia-nccl-cu12 2.19.3
nvidia-nvjitlink-cu12 12.4.127
nvidia-nvtx-cu12 12.1.105
oauthlib 3.2.2
omegaconf 2.3.0
onnxruntime 1.17.3
openai 1.17.0
opentelemetry-api 1.24.0
opentelemetry-exporter-otlp-proto-common 1.24.0
opentelemetry-exporter-otlp-proto-grpc 1.24.0
opentelemetry-instrumentation 0.45b0
opentelemetry-instrumentation-asgi 0.45b0
opentelemetry-instrumentation-fastapi 0.45b0
opentelemetry-proto 1.24.0
opentelemetry-sdk 1.24.0
opentelemetry-semantic-conventions 0.45b0
opentelemetry-util-http 0.45b0
optimum 1.19.1
orjson 3.10.0
oss2 2.18.4
overrides 7.7.0
packaging 24.0
pandas 2.2.1
paramiko 3.4.0
peft 0.8.0
Pillow 9.5.0
pip 24.0
pip-review 1.3.0
pipreqs 0.4.13
platformdirs 4.2.0
posthog 3.5.0
preshed 3.0.9
protobuf 3.20.0
psutil 5.9.8
pulsar-client 3.5.0
py-cpuinfo 9.0.0
pyarrow 15.0.2
pyarrow-hotfix 0.6
pyasn1 0.6.0
pyasn1_modules 0.4.0
pycparser 2.22
pycryptodome 3.20.0
pydantic 1.10.15
pydantic_core 2.16.3
pymilvus 2.4.0
PyMuPDF 1.24.1
PyMuPDFb 1.24.1
PyNaCl 1.5.0
pynvml 11.5.0
pyparsing 3.1.2
pypdf 4.2.0
PyPika 0.48.9
pyproject_hooks 1.0.0
pyrsistent 0.20.0
python-dateutil 2.9.0.post0
python-dotenv 0.21.1
pytz 2024.1
PyYAML 6.0.1
rank-bm25 0.2.2
regex 2023.12.25
requests 2.31.0
requests-oauthlib 2.0.0
reranker 0.2.3
rouge 1.0.1
rsa 4.9
safetensors 0.4.2
scikit-learn 1.4.1.post1
scipy 1.13.0
sentence-transformers 2.6.1
sentencepiece 0.2.0
setuptools 68.2.2
simplejson 3.19.2
six 1.16.0
smart-open 6.4.0
sniffio 1.3.1
sortedcontainers 2.4.0
soupsieve 2.5
spacy 3.7.4
spacy-legacy 3.0.12
spacy-loggers 1.0.5
SQLAlchemy 2.0.29
srsly 2.4.8
starlette 0.37.2
strictjson 4.1.0
striprtf 0.0.26
sympy 1.12
tenacity 8.2.3
texttable 1.7.0
thinc 8.2.3
threadpoolctl 3.4.0
tiktoken 0.6.0
tokenizers 0.19.1
tomli 2.0.1
torch 2.2.2
tornado 6.4
tqdm 4.66.2
transformers 4.40.1
triton 2.2.0
typer 0.9.4
types-requests 2.31.0.20240406
typing_extensions 4.11.0
typing-inspect 0.9.0
tzdata 2024.1
ujson 5.9.0
unidic-lite 1.0.8
urllib3 2.0.7
uvicorn 0.29.0
uvloop 0.19.0
voyageai 0.2.2
wasabi 1.1.2
watchfiles 0.21.0
weasel 0.3.4
websocket-client 0.59.0
websockets 12.0
Werkzeug 3.0.2
wget 3.2
wheel 0.41.2
wrapt 1.16.0
xxhash 3.4.1
yapf 0.40.2
yarg 0.1.9
yarl 1.9.4
zipp 3.15.0

@Stangerine
Author

During fine-tuning of bge-reranker-v2-minicpm-layerwise, the loss floats around 30. Is this normal? I'm a newbie; can you help me out?

@545999961
Collaborator

> Friends, do these two warnings have any impact on training the reranker?

It doesn't matter; you can continue training.
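
Both warnings are indeed harmless, but they can be silenced by being explicit. A hedged sketch of the two fixes (illustrative calls only, not the repo's actual training code; `trust_remote_code` may or may not be needed for this tokenizer):

```python
# Illustrative only -- not FlagEmbedding's training code.
from transformers import AutoTokenizer, TrainingArguments

tokenizer = AutoTokenizer.from_pretrained(
    "BAAI/bge-reranker-v2-minicpm-layerwise", trust_remote_code=True
)

# 1) Tokenizer warning: with padding=True, max_length is ignored unless a
#    truncation strategy is set; being explicit honors max_length.
batch = tokenizer(
    ["example query", "example passage"],
    padding="max_length",
    truncation=True,
    max_length=512,
    return_tensors="pt",
)

# 2) Checkpointing warning: pass use_reentrant explicitly
#    (gradient_checkpointing_kwargs is available in transformers >= 4.35).
args = TrainingArguments(
    output_dir="out",
    gradient_checkpointing=True,
    gradient_checkpointing_kwargs={"use_reentrant": False},
)
```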

@545999961
Collaborator

> During fine-tuning of bge-reranker-v2-minicpm-layerwise, the loss floats around 30. Is this normal?

This is normal: the final loss is the sum of the losses from each layer.
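
To make the magnitude concrete, here is a toy sketch of layerwise loss accumulation (assumptions: a 40-layer MiniCPM backbone, heads from `--start_layer 8` upward, and a cross-entropy ranking loss per layer; the repo's actual loss code differs in detail):

```python
# Toy illustration of why the summed layerwise loss sits in the tens.
import torch
import torch.nn.functional as F

num_layers, start_layer = 40, 8   # assumed 40-layer backbone
train_group_size = 16             # 1 positive + 15 negatives, as in the command above

total = torch.tensor(0.0)
for _ in range(start_layer, num_layers + 1):
    scores = torch.randn(1, train_group_size)  # this layer's head scores
    target = torch.zeros(1, dtype=torch.long)  # positive passage at index 0
    total = total + F.cross_entropy(scores, target)

# 33 terms * ln(16) ~ 91 at random init; once the per-layer loss drops
# below ~1, the summed total lands near 30.
print(total.item())
```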

@Stangerine
Author

> This is normal: the final loss is the sum of the losses from each layer.

Thank you very much!
