NameError: name 'index_first_axis' is not defined #746

Open

Stangerine opened this issue May 2, 2024 · 14 comments

@Stangerine

Can anyone help me? Thanks.

@staoxiao
Collaborator

staoxiao commented May 2, 2024

Can you share the command you used?

@Stangerine
Author

> Can you share the command you used?

torchrun --nproc_per_node 1 \
-m FlagEmbedding.llm_reranker.finetune_for_layerwise.run \
--output_dir /opt/data/private/zzq/models/bge-reranker-v2-minicpm-layerwise-finetuned \
--model_name_or_path /opt/data/private/zzq/models/bge-reranker-v2-minicpm-layerwise \
--train_data /opt/data/private/zzq/dataset/train_data/2.0/finetune_data_for_reranker_2.0.jsonl \
--learning_rate 2e-4 \
--num_train_epochs 1 \
--per_device_train_batch_size 1 \
--gradient_accumulation_steps 16 \
--dataloader_drop_last True \
--query_max_len 512 \
--passage_max_len 8192 \
--train_group_size 16 \
--logging_steps 1 \
--save_steps 2000 \
--save_total_limit 50 \
--ddp_find_unused_parameters False \
--gradient_checkpointing \
--deepspeed /opt/data/private/zzq/train/stage1.json \
--warmup_ratio 0.1 \
--bf16 \
--use_lora True \
--lora_rank 32 \
--lora_alpha 64 \
--use_flash_attn True \
--target_modules q_proj k_proj v_proj o_proj \
--start_layer 8 \
--head_multi True \
--head_type simple \
--lora_extra_parameters linear_head \
--finetune_type from_finetuned_model

@staoxiao
Collaborator

staoxiao commented May 2, 2024

@545999961, please take a look at this issue when convenient.

@Stangerine
Author

> @545999961, please take a look at this issue when convenient.

Thank you, my friend!

@545999961
Collaborator

Can you provide specific error information? I want to know where the error occurred.

@Stangerine
Author

> Can you provide specific error information? I want to know where the error occurred.

[screenshot of the traceback, ending in `NameError: name 'index_first_axis' is not defined`]

@545999961
Collaborator

Can you provide your versions of transformers and flash-attn?
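
For reference, a quick way to check the installed versions, assuming a standard pip environment where all three packages import:

```python
# Hypothetical helper, not part of FlagEmbedding: print the versions
# the maintainers are asking about.
import torch
import transformers
import flash_attn

print("torch:", torch.__version__)
print("transformers:", transformers.__version__)
print("flash-attn:", flash_attn.__version__)
```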

@Stangerine
Author

> Can you provide your versions of transformers and flash-attn?

Thank you, it has been solved. I changed the versions of flash-attn and torch.
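
For anyone who lands here with the same error: `index_first_axis` comes from `flash_attn.bert_padding`, and transformers only binds that name when this import succeeds. A minimal check, assuming the environment from this thread (reinstalling flash-attn against the matching torch/CUDA build is the usual fix, as the author did):

```python
# Minimal reproduction of the failing import: transformers only defines
# index_first_axis when this succeeds, so a NameError at call time usually
# means flash-attn is incompatible with the installed torch/CUDA build.
from flash_attn.bert_padding import index_first_axis, pad_input, unpad_input

print("flash-attn padding helpers import cleanly")
```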

@Stangerine
Author

> Can you provide your versions of transformers and flash-attn?

/root/anaconda3/envs/zzq_kdd/lib/python3.9/site-packages/transformers/tokenization_utils_base.py:2692: UserWarning: `max_length` is ignored when `padding=True` and there is no truncation strategy. To pad to max length, use `padding='max_length'`.

/root/anaconda3/envs/zzq_kdd/lib/python3.9/site-packages/torch/utils/checkpoint.py:460: UserWarning: torch.utils.checkpoint: please pass in use_reentrant=True or use_reentrant=False explicitly. The default value of use_reentrant will be updated to be False in the future. To maintain current behavior, pass use_reentrant=True. It is recommended that you use use_reentrant=False. Refer to docs for more details on the differences between the two variants.

Friends, do these two warnings have any impact on training the reranker?

@Stangerine
Author

Here is my environment (`pip list` output):

accelerate 0.29.1
addict 2.4.0
aiohttp 3.9.3
aiolimiter 1.1.0
aiosignal 1.3.1
aliyun-python-sdk-core 2.15.0
aliyun-python-sdk-kms 2.16.2
annotated-types 0.6.0
antlr4-python3-runtime 4.9.3
anyio 4.3.0
argon2-cffi 23.1.0
argon2-cffi-bindings 21.2.0
asgiref 3.8.1
async-timeout 4.0.3
attrs 23.2.0
auto_gptq 0.7.1
azure-core 1.30.1
azure-storage-blob 12.19.1
backoff 2.2.1
bcrypt 4.1.2
beautifulsoup4 4.12.3
blinker 1.7.0
blis 0.7.11
build 1.2.1
cachetools 5.3.3
catalogue 2.0.10
certifi 2024.2.2
cffi 1.16.0
charset-normalizer 3.3.2
chroma-hnswlib 0.7.3
chromadb 0.4.24
click 8.1.7
cloudpathlib 0.16.0
cohere 5.3.3
coloredlogs 15.0.1
confection 0.1.4
crcmod 1.7
cryptography 42.0.5
cycler 0.11.0
cymem 2.0.8
dataclasses-json 0.6.4
datasets 2.18.0
DBUtils 3.1.0
deepspeed 0.14.2
Deprecated 1.2.14
dill 0.3.8
dirtyjson 1.0.8
distro 1.9.0
docker 6.1.3
docker-compose 1.29.2
dockerpty 0.4.1
docopt 0.6.2
dpr 0.2.1
einops 0.7.0
en-core-web-sm 3.7.1
environs 9.5.0
exceptiongroup 1.2.0
faiss-gpu 1.7.2
fastapi 0.110.2
fastavro 1.9.4
filelock 3.12.2
FlagEmbedding 1.2.9
flash-attention 1.0.0
flash-attn 2.5.7
Flask 3.0.3
flatbuffers 24.3.25
fonttools 4.38.0
frozenlist 1.4.1
fsspec 2024.2.0
gast 0.5.4
gekko 1.1.1
google-auth 2.29.0
googleapis-common-protos 1.63.0
greenlet 3.0.3
grpcio 1.60.0
h11 0.14.0
hjson 3.1.0
httpcore 1.0.5
httptools 0.6.1
httpx 0.27.0
httpx-sse 0.4.0
huggingface-hub 0.22.2
humanfriendly 10.0
hybrid 1.2.3
idna 3.7
importlib-metadata 6.7.0
importlib-resources 5.12.0
isodate 0.6.1
itsdangerous 2.1.2
jieba 0.42.1
Jinja2 3.1.3
jmespath 0.10.0
joblib 1.3.2
JPype1 1.5.0
jsonlines 3.1.0
jsonpatch 1.33
jsonpointer 2.4
jsonschema 3.2.0
kiwisolver 1.4.5
konlpy 0.6.0
kubernetes 29.0.0
langchain 0.1.16
langchain-chroma 0.1.0
langchain-community 0.0.32
langchain-core 0.1.42
langchain-openai 0.1.3
langchain-text-splitters 0.0.1
langcodes 3.3.0
langsmith 0.1.46
llama-index 0.10.29
llama-index-agent-openai 0.2.2
llama-index-cli 0.1.11
llama-index-core 0.10.29
llama-index-embeddings-openai 0.1.7
llama-index-indices-managed-llama-cloud 0.1.5
llama-index-legacy 0.9.48
llama-index-llms-openai 0.1.15
llama-index-multi-modal-llms-openai 0.1.5
llama-index-postprocessor-flag-embedding-reranker 0.1.2
llama-index-program-openai 0.1.5
llama-index-question-gen-openai 0.1.3
llama-index-readers-file 0.1.16
llama-index-readers-llama-parse 0.1.4
llama-parse 0.4.0
llamaindex-py-client 0.1.18
LM-Cocktail 0.0.4
lxml 5.2.1
MarkupSafe 2.1.5
marshmallow 3.21.1
matplotlib 3.5.3
mecab-python3 1.0.8
milvus-model 0.2.0
minio 7.2.5
mmh3 4.1.0
modelscope 1.13.3
monotonic 1.6
mpmath 1.3.0
msgspec 0.18.6
multidict 6.0.5
multiprocess 0.70.16
murmurhash 1.0.10
mypy-extensions 1.0.0
nest-asyncio 1.6.0
networkx 3.2.1
ninja 1.11.1.1
nltk 3.8.1
numpy 1.26.4
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12 8.9.2.26
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu12 12.1.0.106
nvidia-nccl-cu12 2.19.3
nvidia-nvjitlink-cu12 12.4.127
nvidia-nvtx-cu12 12.1.105
oauthlib 3.2.2
omegaconf 2.3.0
onnxruntime 1.17.3
openai 1.17.0
opentelemetry-api 1.24.0
opentelemetry-exporter-otlp-proto-common 1.24.0
opentelemetry-exporter-otlp-proto-grpc 1.24.0
opentelemetry-instrumentation 0.45b0
opentelemetry-instrumentation-asgi 0.45b0
opentelemetry-instrumentation-fastapi 0.45b0
opentelemetry-proto 1.24.0
opentelemetry-sdk 1.24.0
opentelemetry-semantic-conventions 0.45b0
opentelemetry-util-http 0.45b0
optimum 1.19.1
orjson 3.10.0
oss2 2.18.4
overrides 7.7.0
packaging 24.0
pandas 2.2.1
paramiko 3.4.0
peft 0.8.0
Pillow 9.5.0
pip 24.0
pip-review 1.3.0
pipreqs 0.4.13
platformdirs 4.2.0
posthog 3.5.0
preshed 3.0.9
protobuf 3.20.0
psutil 5.9.8
pulsar-client 3.5.0
py-cpuinfo 9.0.0
pyarrow 15.0.2
pyarrow-hotfix 0.6
pyasn1 0.6.0
pyasn1_modules 0.4.0
pycparser 2.22
pycryptodome 3.20.0
pydantic 1.10.15
pydantic_core 2.16.3
pymilvus 2.4.0
PyMuPDF 1.24.1
PyMuPDFb 1.24.1
PyNaCl 1.5.0
pynvml 11.5.0
pyparsing 3.1.2
pypdf 4.2.0
PyPika 0.48.9
pyproject_hooks 1.0.0
pyrsistent 0.20.0
python-dateutil 2.9.0.post0
python-dotenv 0.21.1
pytz 2024.1
PyYAML 6.0.1
rank-bm25 0.2.2
regex 2023.12.25
requests 2.31.0
requests-oauthlib 2.0.0
reranker 0.2.3
rouge 1.0.1
rsa 4.9
safetensors 0.4.2
scikit-learn 1.4.1.post1
scipy 1.13.0
sentence-transformers 2.6.1
sentencepiece 0.2.0
setuptools 68.2.2
simplejson 3.19.2
six 1.16.0
smart-open 6.4.0
sniffio 1.3.1
sortedcontainers 2.4.0
soupsieve 2.5
spacy 3.7.4
spacy-legacy 3.0.12
spacy-loggers 1.0.5
SQLAlchemy 2.0.29
srsly 2.4.8
starlette 0.37.2
strictjson 4.1.0
striprtf 0.0.26
sympy 1.12
tenacity 8.2.3
texttable 1.7.0
thinc 8.2.3
threadpoolctl 3.4.0
tiktoken 0.6.0
tokenizers 0.19.1
tomli 2.0.1
torch 2.2.2
tornado 6.4
tqdm 4.66.2
transformers 4.40.1
triton 2.2.0
typer 0.9.4
types-requests 2.31.0.20240406
typing_extensions 4.11.0
typing-inspect 0.9.0
tzdata 2024.1
ujson 5.9.0
unidic-lite 1.0.8
urllib3 2.0.7
uvicorn 0.29.0
uvloop 0.19.0
voyageai 0.2.2
wasabi 1.1.2
watchfiles 0.21.0
weasel 0.3.4
websocket-client 0.59.0
websockets 12.0
Werkzeug 3.0.2
wget 3.2
wheel 0.41.2
wrapt 1.16.0
xxhash 3.4.1
yapf 0.40.2
yarg 0.1.9
yarl 1.9.4
zipp 3.15.0

@Stangerine
Author

During fine-tuning of bge-reranker-v2-minicpm-layerwise, the loss floats around 30. Is this normal? I'm a newbie; can you help me out?

@545999961
Collaborator

> Friends, do these two warnings have any impact on training the reranker?

It doesn't matter; you can continue training.
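
Both warnings are indeed harmless, but they can be silenced by being explicit. A hedged sketch of the two fixes (illustrative calls only, not the repo's actual training code; `trust_remote_code` may or may not be needed for this tokenizer):

```python
# Illustrative only -- not FlagEmbedding's training code.
from transformers import AutoTokenizer, TrainingArguments

tokenizer = AutoTokenizer.from_pretrained(
    "BAAI/bge-reranker-v2-minicpm-layerwise", trust_remote_code=True
)

# 1) Tokenizer warning: with padding=True, max_length is ignored unless a
#    truncation strategy is set; being explicit honors max_length.
batch = tokenizer(
    ["example query", "example passage"],
    padding="max_length",
    truncation=True,
    max_length=512,
    return_tensors="pt",
)

# 2) Checkpointing warning: pass use_reentrant explicitly
#    (gradient_checkpointing_kwargs is available in transformers >= 4.35).
args = TrainingArguments(
    output_dir="out",
    gradient_checkpointing=True,
    gradient_checkpointing_kwargs={"use_reentrant": False},
)
```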

@545999961
Collaborator

> During fine-tuning of bge-reranker-v2-minicpm-layerwise, the loss floats around 30. Is this normal?

This is normal: the final loss is the sum of the losses from each layer.
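
To make the magnitude concrete, here is a toy sketch of layerwise loss accumulation (assumptions: a 40-layer MiniCPM backbone, heads from `--start_layer 8` upward, and a cross-entropy ranking loss per layer; the repo's actual loss code differs in detail):

```python
# Toy illustration of why the summed layerwise loss sits in the tens.
import torch
import torch.nn.functional as F

num_layers, start_layer = 40, 8   # assumed 40-layer backbone
train_group_size = 16             # 1 positive + 15 negatives, as in the command above

total = torch.tensor(0.0)
for _ in range(start_layer, num_layers + 1):
    scores = torch.randn(1, train_group_size)  # this layer's head scores
    target = torch.zeros(1, dtype=torch.long)  # positive passage at index 0
    total = total + F.cross_entropy(scores, target)

# 33 terms * ln(16) ~ 91 at random init; once the per-layer loss drops
# below ~1, the summed total lands near 30.
print(total.item())
```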

@Stangerine
Author

> This is normal: the final loss is the sum of the losses from each layer.

Thank you very much!
