Intel GPU: specify the tolerance for torchbench models #125213

Closed
wants to merge 1 commit into from


weishi-deng
Contributor

@weishi-deng weishi-deng commented Apr 30, 2024

Some Torchbench models fail the accuracy check on Intel GPU because the default tolerance is too tight. In general, we follow CUDA practice, so this PR adjusts the tolerance for Torchbench models in training mode on Intel GPU devices to match the CUDA setting.
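To make the intent concrete, here is a minimal sketch of device-aware tolerance selection. It is not the diff from this PR: the function name `pick_tolerance`, the constants, the numeric values, and the model name `example_noisy_model` are all hypothetical and chosen only for illustration. The one idea taken from the description is that Intel GPU (`"xpu"`) should be treated the same way as `"cuda"` in training mode.

```python
# Illustrative sketch only. All names and numeric values below are
# assumptions for this example, not values taken from the PR.

DEFAULT_TOLERANCE = 1e-4    # hypothetical default for accuracy checks
TRAINING_TOLERANCE = 1e-3   # hypothetical relaxed value for training runs

# Devices that share the relaxed training tolerance in this sketch.
RELAXED_TRAINING_DEVICES = {"cuda", "xpu"}

# Hypothetical per-model overrides for models with noisier training results.
PER_MODEL_TRAINING_TOLERANCE = {"example_noisy_model": 8e-2}


def pick_tolerance(is_training: bool, device: str, model_name: str) -> float:
    """Return the tolerance to use when comparing model outputs."""
    if is_training and device in RELAXED_TRAINING_DEVICES:
        # Use a per-model override if one exists, else the relaxed default.
        return PER_MODEL_TRAINING_TOLERANCE.get(model_name, TRAINING_TOLERANCE)
    # Inference runs and other devices keep the stricter default.
    return DEFAULT_TOLERANCE
```

With this shape, `pick_tolerance(True, "xpu", name)` and `pick_tolerance(True, "cuda", name)` return the same value for any model name, which is the alignment the description asks for.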

cc @ezyang @msaroufim @bdhirsh @anijain2305 @chauhang @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @chenyang78 @kadeng


pytorch-bot bot commented Apr 30, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/125213

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 000d2cf with merge base 8320b77:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@weishi-deng weishi-deng marked this pull request as draft April 30, 2024 07:15
@weishi-deng weishi-deng marked this pull request as ready for review April 30, 2024 07:17
@EikanWang EikanWang added the ciflow/xpu Run XPU CI tasks label Apr 30, 2024
@EikanWang EikanWang added the topic: not user facing topic category label Apr 30, 2024
@EikanWang
Collaborator

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label May 1, 2024
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).


andoorve pushed a commit to andoorve/pytorch that referenced this pull request May 1, 2024
We encountered some model accuracy failures as the tolerance is critical. In general, we align with CUDA practice. This PR intends to adjust the tolerance for Torchbench models for training mode on Intel GPU devices and aligns with CUDA.

Pull Request resolved: pytorch#125213
Approved by: https://github.com/desertfire
facebook-github-bot pushed a commit to pytorch/benchmark that referenced this pull request May 2, 2024
Summary:
We encountered some model accuracy failures as the tolerance is critical. In general, we align with CUDA practice. This PR intends to adjust the tolerance for Torchbench models for training mode on Intel GPU devices and aligns with CUDA.

X-link: pytorch/pytorch#125213
Approved by: https://github.com/desertfire

Reviewed By: kit1980

Differential Revision: D56862220

fbshipit-source-id: a773ff0162da3bcac91834876c5ab0335c03ed53
pytorch-bot bot pushed a commit that referenced this pull request May 3, 2024
We encountered some model accuracy failures as the tolerance is critical. In general, we align with CUDA practice. This PR intends to adjust the tolerance for Torchbench models for training mode on Intel GPU devices and aligns with CUDA.

Pull Request resolved: #125213
Approved by: https://github.com/desertfire
Projects
Status: Done

5 participants