Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: causal_conv1d with channel last layout requires strides (x.stride(0) and x.stride(2)) to be multiples of 8 #362

Closed
wants to merge 3 commits into from

Conversation

Dexterp37
Copy link

@Dexterp37 Dexterp37 commented Jun 5, 2024

This addresses the following problem:

RuntimeError: causal_conv1d with channel last layout requires strides (x.stride(0) and x.stride(2)) to be multiples of 8

I'm not sure if this is the correct fix, but I'm quite sure I can't run without these changes.

This addresses the following problem:

```
RuntimeError: causal_conv1d with channel last layout requires strides (x.stride(0) and x.stride(2)) to be multiples of 8
```
This covers the "slow path" as well.
@catalpaaa
Copy link

catalpaaa commented Jun 9, 2024

you should add contiguous() at line 880 882 and 883 too for training to run

dxBC_given = rearrange(dxBC_given, "b s d -> b d s").contiguous()
dxBC_given, dweight, dbias, *_ = causal_conv1d_cuda.causal_conv1d_bwd(
    rearrange(xBC, "b s d -> b d s").contiguous(), conv1d_weight, conv1d_bias,
    rearrange(dxBC, "b s d -> b d s").contiguous(), seq_idx, None, None, dxBC_given, False, ctx.activation in ["silu", "swish"]
)

(sorry if i got the line numbers wrong)

@catalpaaa catalpaaa mentioned this pull request Jun 12, 2024
@Dexterp37
Copy link
Author

you should add contiguous() at line 880 882 and 883 too for training to run

Whoops, you're indeed right! Thanks @catalpaaa , I updated the PR .

@Dexterp37
Copy link
Author

@tridao , is this the expected way to address the mentioned problem? I believe (but may be wrong!) that calling contiguous might have negative performance implications.

@catalpaaa
Copy link

@tridao , is this the expected way to address the mentioned problem? I believe (but may be wrong!) that calling contiguous might have negative performance implications.

#351 (comment) try this

@Dexterp37
Copy link
Author

@tridao , is this the expected way to address the mentioned problem? I believe (but may be wrong!) that calling contiguous might have negative performance implications.

#351 (comment) try this

You're indeed right!

@Dexterp37 Dexterp37 closed this Jun 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants