
Performance is better in 1.6.1 release compared to 1.7.4 release in many models #419

Open
2 of 4 tasks
vineethanandh opened this issue Sep 22, 2023 · 3 comments
Labels
bug Something isn't working

Comments

@vineethanandh
Contributor

System Info

Optimum-habana - 1.7.4
Synapse AI - 1.12.0
Docker - 1.12.0-463
Gaudi2 (HLS 225) - 1x and 8x.

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Steps to reproduce (running SwinT on 1x):

  1. Download and install optimum-habana:
  2. git clone https://github.com/huggingface/optimum-habana.git
  3. cd optimum-habana
  4. git checkout v1.6-release
  5. pip install -r examples/image-classification/requirements.txt
  6. pip install optimum-habana==1.6.1
  7. python3 /root/optimum-habana/examples/image-classification/run_image_classification.py --model_name_or_path microsoft/swin-base-patch4-window7-224 --dataset_name cifar10 --output_dir /tmp/swint_hf/results/ --remove_unused_columns False --do_train --learning_rate 2e-05 --per_device_train_batch_size 64 --evaluation_strategy no --save_strategy no --load_best_model_at_end True --save_total_limit 3 --seed 1337 --use_habana --use_lazy_mode --gaudi_config_name Habana/swin --throughput_warmup_steps 3 --ignore_mismatched_sizes --bf16 --num_train_epochs 1 --logging_steps 20 --dataloader_num_workers 8

Expected behavior

The expected behaviour is that, on SynapseAI 1.12.0-463, optimum-habana 1.6.1 and optimum-habana 1.7.4 deliver similar performance.

What is observed instead is that performance is better with optimum-habana 1.6.1 and comparatively lower with 1.7.4.

This applies to SwinT, ViT, and BERT-Large, on both 1x and 8x.
E.g. the throughput values for SwinT are given below:

OH 1.7.4 values: 362.524, 362.566, 360.719, 358.089

OH 1.6.1 values: 389.045, 390.971, 389.587

Almost a 7.5% drop.
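The size of the regression can be sanity-checked from the reported numbers. A minimal sketch, using only the throughput values quoted above (units assumed to be samples/s, as printed by the example script):

```python
# Compare mean throughput between the two optimum-habana releases
# and compute the relative drop. Values copied from the issue.
from statistics import mean

oh_174 = [362.524, 362.566, 360.719, 358.089]  # OH 1.7.4 runs
oh_161 = [389.045, 390.971, 389.587]           # OH 1.6.1 runs

drop = (mean(oh_161) - mean(oh_174)) / mean(oh_161)
print(f"1.7.4 mean: {mean(oh_174):.2f}")
print(f"1.6.1 mean: {mean(oh_161):.2f}")
print(f"relative drop: {drop:.1%}")  # ~7.4%, consistent with "almost 7.5%"
```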

@vineethanandh vineethanandh added the bug Something isn't working label Sep 22, 2023
@vineethanandh vineethanandh changed the title Performance is better in 1.6.1 release compared to 1.7.4 release in may models Performance is better in 1.6.1 release compared to 1.7.4 release in many models Sep 22, 2023
@regisss
Collaborator

regisss commented Sep 25, 2023

I'm going to look into it

@vineethanandh
Contributor Author

@regisss - Did you get some time to check this behaviour?

@regisss
Collaborator

regisss commented Oct 9, 2023

> @regisss - Did you get some time to check this behaviour?

Not yet. I don't think I'll have time to do it this week and before releasing Optimum Habana v1.8. I'll investigate this by next week and will release a patch if needed.
