allocation order propagation for matmul/linear #2198
Comments
Since the scheduler is free to determine some output stride orders, does that mean we cannot fully propagate it before segmentation? What if this were done during segmentation instead: when we compute heuristics we could also query the output allocation domains. If we do that in topological order, we would have the proper allocation domain available when computing heuristics / during scheduling.
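To make the suggestion concrete, here is a minimal sketch in plain Python. None of these names (Segment dicts, decide functions, schedule_segments) are real nvFuser APIs; this only models the idea that, if segments are processed in topological order, each segment's heuristic can see the allocation orders its producer segments already decided on:

```python
# Hypothetical model of topo-order segment scheduling; invented names,
# not nvFuser API. An "allocation order" is a stride-order permutation.

def schedule_segments(segments):
    """segments: list of dicts, already in topological order.
    Each segment lists its input tensor names and provides a 'decide'
    function that, given the known input allocation orders, picks a
    heuristic and fixes the allocation orders of its outputs."""
    alloc_order = {}  # tensor name -> stride-order permutation
    decisions = []
    for seg in segments:
        # Producers ran first, so inputs' allocation orders are known.
        in_orders = {t: alloc_order.get(t) for t in seg["inputs"]}
        heuristic, out_orders = seg["decide"](in_orders)
        decisions.append((seg["name"], heuristic))
        alloc_order.update(out_orders)  # now visible to consumer segments
    return decisions, alloc_order


def pw_decide(in_orders):
    # A pointwise segment that chooses a permuted output order.
    return "pointwise", {"t1": (0, 2, 3, 1)}


def mm_decide(in_orders):
    # A matmul segment whose heuristic depends on its input's order.
    h = "matmul_nhwc" if in_orders["t1"] == (0, 2, 3, 1) else "matmul_nchw"
    return h, {"t2": (0, 1, 2, 3)}
```

Here the matmul segment can only pick the right heuristic because the pointwise segment's output allocation order is already available when it is scheduled, which is the property being asked for.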
The challenge here is to: 1. identify the boundary of each segment before segmentation has happened; 2. know how each segment's I/O tensors would be mutated into different memory formats by its scheduler.
IIUC, this is suggesting that each scheduler's …
Question for @jacobhinkle: is the ask above what you were expecting from allocation order inference for now?
Something like that, yes. For example, in #2169 we might want to temporarily disallow matmul segments with a bias whose stride order does not match the output's. At minimum, though, we'd want to have this available during …
The issue was raised by @jacobhinkle:
We would like allocation order inference to populate proper allocation domain for inputs to matmul/linear ops.
With a vanilla fusion, tv0_derived and tv1_derived will have an empty allocation domain. This is not ideal: imagine if tv0 and tv1 come in with a non-trivial allocation_domain.
The ask here is:
- Create allocation domains on tv0_derived and tv1_derived and populate them properly from their producers.
- Limit this to inputs of a matmul/linear operation (or other pattern matching that we want to apply).
- Leave tv_out alone; populating its allocation domain is better done by the scheduler.
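A rough sketch of what such an inference pass could do, again as a plain Python model with invented names (the real pass would operate on nvFuser TensorViews): derived tensors inherit their producer's allocation order, but only when they directly feed a matmul/linear op, and fusion outputs are never touched.

```python
# Hypothetical model: each tensor carries an optional allocation order
# (a permutation of its axes, e.g. (0, 2, 1)). Invented names throughout.

def infer_allocation_order(tensors, edges, matmul_inputs, outputs):
    """tensors: name -> allocation order (tuple) or None if empty
    edges: (producer, consumer) pairs in topological order
    matmul_inputs: names that are direct inputs to a matmul/linear op
    outputs: fusion output names, left for the scheduler to decide"""
    order = dict(tensors)
    for producer, consumer in edges:
        if consumer in outputs:
            continue  # the scheduler's job; do not populate
        if consumer in matmul_inputs and order.get(consumer) is None:
            # Inherit the producer's non-trivial allocation order.
            order[consumer] = order.get(producer)
    return order
```

For example, with tv0 carrying order (0, 2, 1), tv0_derived empty and feeding a matmul, and tv_out as the fusion output, the pass fills in tv0_derived from tv0 and leaves tv_out empty.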