compute special constants. #2830
Comments
Tricks like these are implemented in the LLVM backend (codegen). ISPC could handle them itself, but preferably this should be done in LLVM. I suggest verifying that LLVM doesn't already do this for C/C++ code (using vector extensions), then filing an issue in the LLVM project and linking this one, so we can make sure it takes effect in ISPC once it's implemented. It's important to note in the LLVM issue that this is about vector constants, as they would otherwise assume it's about scalars by default.
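A minimal sketch of the kind of C test case that could be used for the check suggested above, assuming the GCC/Clang `vector_size` extension. The type name `u32x8` and the function `mask_low_byte` are hypothetical, chosen only for illustration. Compiling it with something like `-O2 -mavx2` and reading the generated assembly would show whether the `0x000000FF` mask is loaded from memory or computed in registers.

```c
#include <stdint.h>

/* Eight 32-bit lanes, using the GCC/Clang vector extension. */
typedef uint32_t u32x8 __attribute__((vector_size(32)));

/* Hypothetical test case: AND every lane with the constant 0x000000FF.
   Vectors are passed by pointer so the example does not depend on the
   vector calling convention of the target. */
void mask_low_byte(const u32x8 *in, u32x8 *out) {
    const u32x8 mask = {0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF};
    *out = *in & mask;
}
```

The same pattern with `0xFFFFFFF8` in place of `0xFF` would cover the other example constant from this issue.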
- Constants that are filled with 1s from one side and 0s from the other, such as `0xFFFFFFF8` or `0x000000FF`, can be computed directly rather than being broadcast from memory, which should be faster. Such numbers are common: `1`, `-8`, `255`, ... If `-1` is already present in a register, then `vpcmpeqd` is not needed and this is just one instruction.
- Constants with 1s in the middle can be computed in a similar way, perhaps faster than a broadcast; it should be faster if `-1` is present. These are also common (`2`, `4`, `-2.0`, `0.5`, ...).
- A similar trick can be used to compute constants with 0s in the middle using a shift and a rotate (AVX512 only).
- If a negative number is present, `vpabsd` can be used to get the positive value.
- The double of a number can be computed via `vpaddd`.
- If `-1` is present, the complement of a constant can be computed using `vpxor`.
- Adjacent numbers can be computed by adding or subtracting `-1`.
- For some numbers, `vpsubd` or `vpaddd` can be used instead of double shifts to reduce port contention. For example, to get the number `2`, compute `(~0 >> 30) + ~0` instead of `(~0 >> 31 << 1)` (provided that `-1` is present).

Caveats: these tricks rely on keeping `-1` in a register, which may induce register spilling in certain cases. However, in smaller code sections or where `-1` is already available, these methods may be beneficial.
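The arithmetic behind the tricks listed above can be checked with a scalar model of a single 32-bit lane. This is only a sketch: all function names are hypothetical, `ALL_ONES` stands in for a register holding `-1` (for example one produced by `vpcmpeqd`), and the mapping of the shifts and the rotate to particular instructions (`vpsrld`, `vpslld`, `vprold`) is an assumption rather than something stated in this issue. The float helper additionally assumes IEEE-754 binary32.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* One 32-bit lane with every bit set, i.e. the value -1. */
#define ALL_ONES UINT32_C(0xFFFFFFFF)

/* The n lowest bits set: logical right shift of all-ones. 1 <= n <= 32. */
static inline uint32_t low_ones(unsigned n) {
    assert(n >= 1 && n <= 32);
    return ALL_ONES >> (32 - n);
}

/* Every bit set except the n lowest: left shift of all-ones. n <= 31. */
static inline uint32_t high_ones_above(unsigned n) {
    assert(n <= 31);
    return ALL_ONES << n;
}

/* A run of len ones starting at bit pos: right shift, then left shift. */
static inline uint32_t middle_ones(unsigned pos, unsigned len) {
    assert(len >= 1 && pos + len <= 32);
    return (ALL_ONES >> (32 - len)) << pos;
}

/* Rotate left by r bits (r is reduced modulo 32). */
static inline uint32_t rotl32(uint32_t x, unsigned r) {
    r &= 31;
    return (x << r) | (x >> ((32 - r) & 31));
}

/* A run of len zeros starting at bit pos: left shift, then rotate. */
static inline uint32_t middle_zeros(unsigned pos, unsigned len) {
    assert(len >= 1 && len <= 31 && pos + len <= 32);
    return rotl32(ALL_ONES << len, pos);
}

/* Absolute value of a lane read as two's complement (models vpabsd). */
static inline uint32_t lane_abs(uint32_t x) {
    return (x & UINT32_C(0x80000000)) ? UINT32_C(0) - x : x;
}

/* Doubling by adding a lane to itself (models vpaddd x, x). */
static inline uint32_t lane_double(uint32_t x) { return x + x; }

/* Complement by XOR with all-ones (models vpxor with -1). */
static inline uint32_t lane_not(uint32_t x) { return x ^ ALL_ONES; }

/* Adjacent values: subtracting -1 gives x + 1, adding -1 gives x - 1. */
static inline uint32_t lane_inc(uint32_t x) { return x - ALL_ONES; }
static inline uint32_t lane_dec(uint32_t x) { return x + ALL_ONES; }

/* Two ways to obtain 2: (~0 >> 30) + ~0 versus (~0 >> 31) << 1. */
static inline uint32_t two_via_add(void) { return (ALL_ONES >> 30) + ALL_ONES; }
static inline uint32_t two_via_shifts(void) { return (ALL_ONES >> 31) << 1; }

/* Reinterpret a lane's bits as a float (assumes IEEE-754 binary32). */
static inline float lane_as_float(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

For instance, `low_ones(8)` yields `0x000000FF`, `high_ones_above(3)` yields `0xFFFFFFF8`, and `middle_ones(24, 6)` yields `0x3F000000`, the bit pattern of `0.5f`. The model says nothing about latency or port usage; it only confirms that each identity produces the intended bit pattern.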