Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for torch.compile dynamic shapes #30560

Merged
merged 4 commits into from May 20, 2024

Conversation

warner-benjamin
Copy link
Contributor

This PR adds support for compiling models with dynamic shapes dynamic=True to almost all models with SDPAttention implementations which currently do not support dynamic shapes. #30442 added support for Llama, Gemma, OLMo, & Cohere.

The only model not modified is DBRX, which needs the changes from both #30070 and #30442 to add support for SDPA's Flash Attention kernel and support for dynamic shapes, as it I believe it suffers from the same training memory issues detailed in #30010.

As mentioned in #30442, moving the is_causal dispatch logic from inline to an if statement is required to support both fullgraph=True and dynamic=True.

I kept the qlen>1 comments but could remove them if we want to match Llama, which doesn't have it.

cc @ArthurZucker and @fxmarty

Copy link
Collaborator

@fxmarty fxmarty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for working on this! Just left a minor comment but it looks good!

(note that afaik Transformers does not enforce line violation

ignore = ["C901", "E501", "E741", "F402", "F823" ]
)

Could you add a test for dynamic=True?

src/transformers/models/bart/modeling_bart.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@ArthurZucker ArthurZucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM @fxmarty feel free to merge when CIs are green if it's fine with you!

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@warner-benjamin
Copy link
Contributor Author

warner-benjamin commented May 7, 2024

@fxmarty @ArthurZucker I didn't touch Llava. The current "Tensor-likes are not close" error shouldn't have anything to do with this PR. It should be ready to go from my end.

@ArthurZucker
Copy link
Collaborator

no worries, rebasing on main should most probably fix this!

Copy link
Collaborator

@ArthurZucker ArthurZucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Merging as this is quite important. Let's keep an eye on the slow tests that will be triggered!

@ArthurZucker ArthurZucker merged commit cd6bd0a into huggingface:main May 20, 2024
22 checks passed
@warner-benjamin warner-benjamin deleted the compile_dynamic branch May 20, 2024 16:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants