-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support L2 cache hint #3470
base: main
Are you sure you want to change the base?
Support L2 cache hint #3470
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please make sure there is a test exercising this
// Create cache policy here. Consider moving it to function scope. | ||
// createpolicy.fractional.L2::policy.b64 xx 1.0 | ||
PTXBuilder ptxBuilder; | ||
const bool hasL2EvictPolicy = | ||
op.getEvict() == triton::EvictionPolicy::EVICT_FIRST || | ||
op.getEvict() == triton::EvictionPolicy::EVICT_LAST; | ||
Value policyRet; | ||
if (hasL2EvictPolicy) { | ||
auto &policy = | ||
ptxBuilder.create<>("createpolicy.fractional") | ||
->o("L2::evict_first", | ||
op.getEvict() == triton::EvictionPolicy::EVICT_FIRST) | ||
.o("L2::evict_last", | ||
op.getEvict() == triton::EvictionPolicy::EVICT_LAST) | ||
.b(64); | ||
const std::string writeConstraint = "=l"; | ||
// prepare asm operands | ||
auto *dstOpr = ptxBuilder.newOperand(writeConstraint, | ||
/*init=*/true); | ||
std::string fractionStr = "1.0"; | ||
auto *fractionOpr = ptxBuilder.newConstantOperand(fractionStr); | ||
policy(dstOpr, fractionOpr); | ||
Type policyRetTy = rewriter.getI64Type(); | ||
policyRet = ptxBuilder.launch(rewriter, loc, policyRetTy); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like the this is the same exact code for load and store. Could you move it in a common helper?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the quick review! I will add test cases and refactor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are you still working on the refactoring? Let me know when it's ready to review or you can move it into a draft until then
// CHECK: ld.global.L1::evict_last.L2::cache_hint.v4.b32 | ||
// CHECK: st.global.L1::evict_last.L2::cache_hint.v4.b32 | ||
|
||
#blocked = #triton_gpu.blocked<{sizePerThread = [4], threadsPerWarp = [32], warpsPerCTA = [4], order = [0]}> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you minimize this test and move it to tritongpu_to_llvm.mlir?
// Create cache policy here. Consider moving it to function scope. | ||
// createpolicy.fractional.L2::policy.b64 xx 1.0 | ||
PTXBuilder ptxBuilder; | ||
const bool hasL2EvictPolicy = | ||
op.getEvict() == triton::EvictionPolicy::EVICT_FIRST || | ||
op.getEvict() == triton::EvictionPolicy::EVICT_LAST; | ||
Value policyRet; | ||
if (hasL2EvictPolicy) { | ||
auto &policy = | ||
ptxBuilder.create<>("createpolicy.fractional") | ||
->o("L2::evict_first", | ||
op.getEvict() == triton::EvictionPolicy::EVICT_FIRST) | ||
.o("L2::evict_last", | ||
op.getEvict() == triton::EvictionPolicy::EVICT_LAST) | ||
.b(64); | ||
const std::string writeConstraint = "=l"; | ||
// prepare asm operands | ||
auto *dstOpr = ptxBuilder.newOperand(writeConstraint, | ||
/*init=*/true); | ||
std::string fractionStr = "1.0"; | ||
auto *fractionOpr = ptxBuilder.newConstantOperand(fractionStr); | ||
policy(dstOpr, fractionOpr); | ||
Type policyRetTy = rewriter.getI64Type(); | ||
policyRet = ptxBuilder.launch(rewriter, loc, policyRetTy); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are you still working on the refactoring? Let me know when it's ready to review or you can move it into a draft until then
Sorry, I have forgotten about the refactoring. Will convert to draft and do an internal review first. |
Thanks for working on this. LGTM. I'm also curious about how this moves perf. Maybe kick off a Pytorch nightly perf run? |
What is the easiest way to kick off pytorch perf test? Is it changing the pytorch pin to this hash, then start some job from the diff? @htyu |
Yes. Once you update pytorch pin with this one, push it to a personal branch in pytorch repro and submit a perf run on a branch by using: https://github.com/pytorch/pytorch/actions/workflows/inductor-perf-test-nightly.yml Then see the result ~12 hours later in: |
Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags:
Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags:
This patch currently creates a cache policy for each load/store with eviction policy. It may be better to create cache policy globally for each function.
Fixes #3438