
[dynamo] Turn on guard_nn_modules #125202

Closed
wants to merge 25 commits

Conversation

@anijain2305 (Contributor) commented Apr 29, 2024

Stack from ghstack (oldest at bottom):

Turning on guard_nn_modules adds a large number of guards, so we are bound to take a perf hit. But the perf hit is small. These are the numbers:

![image](https://github.com/pytorch/pytorch/assets/13822661/c8793906-c8c7-432b-9af4-4594713067be)

First, we observe that C++ guards give around a 6x speedup over Python guards, which reduces the total time spent in guards. This is shown in the last column (cpp_guards/inductor_optimized_latency): the worst model is around 1.61%, with most models below 1%. I think this is a good enough signal to turn the config on.

One might also wonder how much guard slowdown occurs with `guard_nn_modules=True`. This is the table:
![image](https://github.com/pytorch/pytorch/assets/13822661/932a885b-1c03-424b-8405-5bc8fd35dd39)

For most models, the guard overhead with nn module guards is under 2x. There are a few outliers where the slowdown is really high, and for those models we spend 1%-2% of the time in C++ guards, as shown in the first table.
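
For context, a minimal sketch of toggling this flag from user code, assuming the `torch._dynamo.config.guard_nn_modules` entry this PR flips (illustrative only, not part of this PR):

```python
import torch
import torch._dynamo

# guard_nn_modules controls whether Dynamo installs guards on nn.Module state
# (parameters, buffers, attributes) instead of treating modules as static.
# This PR flips the default on; it can still be toggled explicitly.
torch._dynamo.config.guard_nn_modules = True

class Tiny(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.lin = torch.nn.Linear(8, 8)

    def forward(self, x):
        return torch.relu(self.lin(x))

compiled = torch.compile(Tiny())
out = compiled(torch.randn(4, 8))  # guarded module state now triggers recompiles when it changes
```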

cc @ezyang @msaroufim @bdhirsh @chauhang @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @chenyang78 @kadeng


pytorch-bot bot commented Apr 29, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/125202

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (2 Unrelated Failures)

As of commit bca250a with merge base ae5e2ab:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@anijain2305 added the `keep-going` label (Don't stop on first failure, keep running tests until the end) on Apr 30, 2024
@anijain2305 changed the title from "[DONT MERGE][FOR CI][dynamo] Turn on guard_nn_modules" to "[dynamo] Turn on guard_nn_modules" on May 10, 2024
@anijain2305 requested review from jansel and ezyang on May 10, 2024
@ezyang (Contributor) left a comment:

You want a JK (JustKnob) for internal rollout so you can kill-switch it.

@ezyang (Contributor) commented May 10, 2024:

🤔 which we don't have precedent for in this module

@ezyang (Contributor) commented May 10, 2024:

You want to call torch._utils_internal.justknobs_check to read out the default, but you can't do the obvious thing of doing it in the config module, as that would cause the JK check to happen at module import time, and @oulgen and I know that you can't do that because it will poison the process for forks. So you want this to happen the first time the config is accessed. The low tech way is to default this to None, and then at the read site, if it is None, query the JK for the real value. The high tech way is to add some capability to the config module getter (I think we've got an accessor function where you can customize) so that it is able to lazily query JK.
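
A minimal sketch of the "low tech" approach described above; `torch._utils_internal.justknobs_check` is real, but the knob name and helper function are hypothetical:

```python
import torch._utils_internal

# In the config module: default to None so no JustKnobs check runs at import
# time (an import-time JK check would poison the process for forks).
guard_nn_modules = None

def _guard_nn_modules_enabled() -> bool:
    """At the read site, lazily resolve the default from JustKnobs on first use."""
    global guard_nn_modules
    if guard_nn_modules is None:
        # Hypothetical knob name, for illustration only.
        guard_nn_modules = torch._utils_internal.justknobs_check(
            "pytorch/dynamo:guard_nn_modules"
        )
    return guard_nn_modules
```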

@ezyang (Contributor) commented May 10, 2024:

Another deployment strategy that doesn't involve JKs is to turn it on in OSS only but not fbcode, then enable it on a per-PG basis by twiddling it, and then later switch the fbcode default to true.
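
A minimal sketch of that OSS-on / fbcode-off default, assuming an `is_fbcode()`-style check like the one used elsewhere in the codebase (not necessarily what this PR ends up doing):

```python
import torch

def is_fbcode() -> bool:
    # fbcode builds don't carry the OSS git_version attribute.
    return not hasattr(torch.version, "git_version")

# In torch/_dynamo/config.py: default on in OSS, off internally until the
# fbcode default is later flipped to True.
guard_nn_modules = not is_fbcode()
```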

@anijain2305 (Contributor, Author) commented:

Going with fbcode off and OSS on for now. Will figure out the internal rollout strategy.

@anijain2305 (Contributor, Author) commented:

@pytorchbot merge

@pytorch-bot added the `ciflow/trunk` label (Trigger trunk jobs on your pull request) on May 11, 2024
@pytorchmergebot (Collaborator) commented:

Merge failed

Reason: This PR needs a `release notes:` label.
If your changes are user facing and intended to be a part of release notes, please use a label starting with `release notes:`.

If not, please add the `topic: not user facing` label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.


@anijain2305 (Contributor, Author) commented:

@pytorchbot merge

@pytorchmergebot (Collaborator) commented:

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here

tinglvv pushed a commit to tinglvv/pytorch that referenced this pull request May 14, 2024
Pull Request resolved: pytorch#125202
Approved by: https://github.com/ezyang
Labels: ciflow/inductor, ciflow/trunk, keep-going, Merged, module: dynamo, oncall: pt2, topic: not user facing