arch-riscv: add agnostic option to vector tail/mask policy for mem and arith instructions #1135
+363
−249
These two commits add agnostic capability for both the tail and mask policies, for vector memory and arithmetic instructions respectively. The common policy is for an instruction to act as undisturbed if either policy (tail or mask) is undisturbed, and to write all 1s only if neither is.
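
To make the combined policy concrete, here is a minimal sketch of how it could be applied per element. It is not taken from the gem5 sources: `applyPolicy`, its parameters, and the 8-bit element width are hypothetical and chosen only for illustration, and the rule it encodes is one reading of the description above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not gem5 code: applies the combined tail/mask policy
// to a vector of 8-bit elements.
//   result  : freshly computed value for every element
//   old_vd  : previous contents of the destination register
//   mask    : per-element mask bit (true = active)
//   vl      : number of body elements; elements at index >= vl are the tail
//   vta/vma : true = agnostic, false = undisturbed
std::vector<uint8_t>
applyPolicy(const std::vector<uint8_t> &result,
            const std::vector<uint8_t> &old_vd,
            const std::vector<bool> &mask,
            std::size_t vl, bool vta, bool vma)
{
    // Assumed reading of the PR text: behave as undisturbed if either
    // policy is undisturbed; write all 1s only when both are agnostic.
    const bool agnostic = vta && vma;

    std::vector<uint8_t> vd(result.size());
    for (std::size_t i = 0; i < vd.size(); ++i) {
        const bool active = i < vl && mask[i];
        if (active)
            vd[i] = result[i];                     // active body element
        else
            vd[i] = agnostic ? 0xFF : old_vd[i];   // tail or masked-off
    }
    return vd;
}
```

Only the undisturbed path reads `old_vd`, which is why a fully agnostic policy can drop the dependency on the old destination register.
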
For instructions that instantiate multiple micro instructions writing to the same register (`VlStride` and `VlIndex` for memory; `VectorGather`, `VectorSlideUp` and `VectorSlideDown` for arithmetic), a new micro instruction named `VPinVdCpyVsMicroInst` is used to pin the destination register, so there is no need to copy partial results between them. The idea is similar to what ARM's SVE code does. This micro instruction also implements the tail/mask policy for these cases.

Finally, it's worth noting that while using an agnostic policy for both tail and mask should now remove all dependencies on the old destination register, `VectorSlideUp` is an exception. The `vslideup_{vx,vi}` instructions need the elements below the offset to remain unchanged. The current implementation overrides the current vta/vma and makes them act as undisturbed, since these instructions require the old destination register anyway. There's a minor issue with this, though: the `v{,f}slide1up` variants do not need the old destination, but since they share the same constructor, they behave the same way.

Related issue #997.
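
The `vslideup` dependency on the old destination can be illustrated with a small model. This is a sketch under simplifying assumptions (unmasked, `vstart` = 0, 8-bit elements), not the gem5 implementation, and `slideUp` is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of vslideup.vx/vi, not gem5 code.
// Assumes the instruction is unmasked and vstart is 0.
std::vector<uint8_t>
slideUp(const std::vector<uint8_t> &old_vd,
        const std::vector<uint8_t> &vs2,
        std::size_t offset, std::size_t vl)
{
    // Start from the old destination: elements below the offset must keep
    // their previous values, and the tail is treated as undisturbed here,
    // mirroring the override described in the PR.
    std::vector<uint8_t> vd(old_vd);

    // Body elements at or above the offset receive the slid source values.
    for (std::size_t i = offset; i < vl && i < vd.size(); ++i)
        vd[i] = vs2[i - offset];
    return vd;
}
```

Because elements `0 .. offset-1` come from `old_vd` regardless of policy, the old destination register remains an input even when vta/vma are agnostic.
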