allocation order propagation for matmul/linear #2198

Open
jjsjann123 opened this issue May 3, 2024 · 4 comments

@jjsjann123
Collaborator

This issue was raised by @jacobhinkle.

We would like allocation order inference to populate proper allocation domain for inputs to matmul/linear ops.

i.e.

tv0 = fusion.define_tensor(...)
tv1 = fusion.define_tensor(...)
# magic operations that produce `tv0_derived` and `tv1_derived`

tv_out = fusion.ops.matmul(tv0_derived, tv1_derived)
# ...

With a vanilla fusion, tv0_derived and tv1_derived will have an empty allocation domain. This is not ideal, especially when tv0 and tv1 come in with a non-trivial allocation_domain.
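For concreteness, here is a minimal sketch of the situation using the nvFuser Python frontend. It assumes `define_tensor` accepts a `stride_order` argument and that `fd.ops.matmul` is available in the build; the shapes and the pointwise ops standing in for the "magic operations" are made up for illustration.

from nvfuser import FusionDefinition, DataType

with FusionDefinition() as fd:
    # tv0 is declared with a non-default stride order (a transposed,
    # column-major-like layout); exact stride_order semantics depend on the
    # frontend version.
    tv0 = fd.define_tensor(shape=[-1, -1], contiguity=[True, True],
                           dtype=DataType.Half, stride_order=[0, 1])
    tv1 = fd.define_tensor(shape=[-1, -1], contiguity=[True, True],
                           dtype=DataType.Half)

    # stand-ins for the "magic operations" above
    tv0_derived = fd.ops.relu(tv0)
    tv1_derived = fd.ops.relu(tv1)

    # Without propagation, tv0_derived / tv1_derived carry an empty allocation
    # domain, so the matmul scheduler cannot tell that tv0 is transposed in memory.
    tv_out = fd.ops.matmul(tv0_derived, tv1_derived)
    fd.add_output(tv_out)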

The ask here is:

  1. We want allocation order inference to infer the allocation order of tv0_derived and tv1_derived and populate it properly from their producers.
  2. The targets of the propagation are recognized simply as the inputs to matmul/linear operations (or by other pattern matching that we want to apply).
  3. We do NOT need to populate the allocation_order for tv_out; that is better done by the scheduler. (A sketch of such a pass follows this list.)
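This is not nvFuser's actual pass or API, but a rough Python sketch of the propagation being asked for. The helpers (`exprs`, `op_type`, `allocation_order`, `definition`) are hypothetical stand-ins for the real IR interfaces, and a real pass would also walk back through longer producer chains:

def propagate_allocation_order(fusion):
    """Copy allocation order onto tensors that feed matmul/linear (items 1-2),
    leaving the matmul/linear output for the scheduler to decide (item 3)."""
    for expr in fusion.exprs():                        # hypothetical traversal helper
        if expr.op_type not in ("matmul", "linear"):   # item 2: pattern-match the targets
            continue
        for tv in expr.inputs():
            if tv.allocation_order is not None:        # already populated, leave it alone
                continue
            producer = tv.definition()                 # the op that produced this tensor
            if producer is None:
                continue                               # a fusion input with no order set
            # item 1: inherit the order from a producer operand that has one set
            for src in producer.inputs():
                if src.allocation_order is not None:
                    tv.allocation_order = src.allocation_order
                    break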
@jacobhinkle
Collaborator

jacobhinkle commented May 5, 2024

Since the scheduler is free to determine some output stride orders, does that mean we cannot really fully propagate it before segmentation? What if this was done during segmentation instead: when we get heuristics, we could also query the output allocation domains. If we do that in topological order, we would have the proper allocation domain available when computing heuristics and during scheduling.

@jjsjann123
Collaborator Author

Since the scheduler is free to determine some output stride orders, does that mean we cannot really fully propagate it before segmentation?

The challenge here is to: 1. identify the boundary of each segment before segmentation happens; 2. know how each segment's I/O tensors would be mutated into a different memory format by its scheduler.

What if this was done during segmentation instead: when we get heuristics, we could also query the output allocation domains. If we do that in topological order, we would have the proper allocation domain available when computing heuristics and during scheduling.

IIUC, this is suggesting that each scheduler's canSchedule would also consider updating an empty alloc dom of its output TensorView and properly handing that to the next segment? Yeah, that would be good to have as well.
With that said, having a global pass to coordinate across all fusion segments seems reasonable to have.
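To make the segment-by-segment alternative concrete, here is an illustrative sketch (all names are hypothetical, not nvFuser's API) of filling in allocation orders in topological order, so each segment's heuristics see the layouts chosen upstream:

def assign_allocation_orders(segmented_fusion):
    decided = {}  # tensor -> allocation order chosen by an upstream segment
    for segment in segmented_fusion.topological_order():
        # Segment inputs inherit whatever upstream segments already decided.
        for tv in segment.inputs():
            if tv in decided and tv.allocation_order is None:
                tv.allocation_order = decided[tv]
        # The segment's heuristics/scheduler pick its output layouts ...
        heuristics = segment.compute_heuristics()
        # ... which then become visible to downstream segments.
        for tv in segment.outputs():
            decided[tv] = heuristics.output_allocation_order(tv)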

@jjsjann123
Collaborator Author

Question for @jacobhinkle: is the ask above what you were expecting from allocation order inference for now?

The ask here is:

  1. We want allocation order inference to infer the allocation order of tv0_derived and tv1_derived and populate it properly from their producers.
  2. The targets of the propagation are recognized simply as the inputs to matmul/linear operations (or by other pattern matching that we want to apply).
  3. We do NOT need to populate the allocation_order for tv_out; that is better done by the scheduler.

@jacobhinkle
Collaborator

IIUC, this is suggesting that each scheduler's canSchedule would also consider updating an empty alloc dom of its output TensorView and properly handing that to the next segment?

Something like that, yes. For example, in #2169 we might want to temporarily disallow matmul segments with a bias whose stride order does not match the output's. At a minimum, though, we'd want to have this available during proposeHeuristics and SchedulerEntry::makeEntry, which happen after segmentation is done and the runtime order is determined. That way we'd be able to reliably infer the layout of matmuls based on input strides.
