Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement runtime dispatch on riscv64 #838

Closed
malaterre opened this issue Jul 7, 2022 · 39 comments
Closed

Implement runtime dispatch on riscv64 #838

malaterre opened this issue Jul 7, 2022 · 39 comments

Comments

@malaterre
Copy link
Contributor

This is a ticket to track progress on runtime dispatch for riscv64 (followup to issue #818).

For now, unit tests are failing, see Debian/riscv64 buildd:

@jan-wassenberg
Copy link
Member

The CPU/emulator running the test likely doesn't support V extension 1.0 yet, but we are building with flags that allow the compiler to assume that the CPU does support it.

Would anyone like to send a patch to targets.cc (analogous to the AT_HWCAP there), to check whether the V extension is supported?

Independently of that, I can help fix the warnings from your build log, thanks for sharing it.

@malaterre
Copy link
Contributor Author

This may take a bit of back-n-forth as Debian does not offer riscv64 porterbox:

So I kindly requested on debian-riscv mailing list to do it for me:

will update the progress here.

copybara-service bot pushed a commit that referenced this issue Jul 7, 2022
PiperOrigin-RevId: 459513304
@jan-wassenberg
Copy link
Member

Sounds good, thanks!

copybara-service bot pushed a commit that referenced this issue Jul 7, 2022
PiperOrigin-RevId: 459513304
copybara-service bot pushed a commit that referenced this issue Jul 8, 2022
PiperOrigin-RevId: 459513304
copybara-service bot pushed a commit that referenced this issue Jul 8, 2022
PiperOrigin-RevId: 459698464
@malaterre
Copy link
Contributor Author

malaterre commented Jul 14, 2022

I was granted access to the following hardware (*), it seems that the instructions is not supported:

$ ctest -R TestAllMulHigh/RVV -V
[...]
147: Note: Google Test filter = HwyMulTestGroup/HwyMulTest.TestAllMulHigh/RVV
147: [==========] Running 1 test from 1 test suite.
147: [----------] Global test environment set-up.
147: [----------] 1 test from HwyMulTestGroup/HwyMulTest
147: [ RUN      ] HwyMulTestGroup/HwyMulTest.TestAllMulHigh/RVV
1/1 Test #147: HwyMulTestGroup/HwyMulTest.TestAllMulHigh/RVV  # GetParam() = 268435456 ...***Exception: Illegal  0.13 sec

0% tests passed, 1 tests failed out of 1

Total Test time (real) =   0.18 sec

The following tests FAILED:
	147 - HwyMulTestGroup/HwyMulTest.TestAllMulHigh/RVV  # GetParam() = 268435456 (ILLEGAL)
Errors while running CTest
Output from these tests are in: /home/malaterre/highway-0.17.1~git20220711.f0a396a/obj-riscv64-linux-gnu/Testing/Temporary/LastTest.log
Use "--rerun-failed --output-on-failure" to re-run the failed cases verbosely.

(*)

$ cat /proc/cpuinfo 
processor	: 0
hart		: 2
isa		: rv64imafdc
mmu		: sv39
uarch		: sifive,bullet0

processor	: 1
hart		: 1
isa		: rv64imafdc
mmu		: sv39
uarch		: sifive,bullet0

processor	: 2
hart		: 3
isa		: rv64imafdc
mmu		: sv39
uarch		: sifive,bullet0

processor	: 3
hart		: 4
isa		: rv64imafdc
mmu		: sv39
uarch		: sifive,bullet0

For reference:

$ cat /usr/include/riscv64-linux-gnu/asm/hwcap.h
/* SPDX-License-Identifier: GPL-2.0-only WITH Linux-syscall-note */
/*
 * Copied from arch/arm64/include/asm/hwcap.h
 *
 * Copyright (C) 2012 ARM Ltd.
 * Copyright (C) 2017 SiFive
 */
#ifndef _ASM_RISCV_HWCAP_H
#define _ASM_RISCV_HWCAP_H

/*
 * Linux saves the floating-point registers according to the ISA Linux is
 * executing on, as opposed to the ISA the user program is compiled for.  This
 * is necessary for a handful of esoteric use cases: for example, userspace
 * threading libraries must be able to examine the actual machine state in
 * order to fully reconstruct the state of a thread.
 */
#define COMPAT_HWCAP_ISA_I	(1 << ('I' - 'A'))
#define COMPAT_HWCAP_ISA_M	(1 << ('M' - 'A'))
#define COMPAT_HWCAP_ISA_A	(1 << ('A' - 'A'))
#define COMPAT_HWCAP_ISA_F	(1 << ('F' - 'A'))
#define COMPAT_HWCAP_ISA_D	(1 << ('D' - 'A'))
#define COMPAT_HWCAP_ISA_C	(1 << ('C' - 'A'))

#endif /* _ASM_RISCV_HWCAP_H */

@jan-wassenberg
Copy link
Member

That's right, we'd have to see v in the list of isa extensions. I see mention of Sifive bullet from 2020, whereas the V spec was ratified only a few months ago. We're testing using Spike or QEMU.

@malaterre
Copy link
Contributor Author

Just for reference, current debian patch is:

@jan-wassenberg
Copy link
Member

Nice, I think the bit we want to test is 1 << ('v' - 'a').

@malaterre
Copy link
Contributor Author

1 << ('v' - 'a')

Lower-case ? Anyway I'll integrate the patch as-is in next upload.

@malaterre
Copy link
Contributor Author

malaterre commented Jul 18, 2022

@jan-wassenberg could you comment on my patch see above. it seems I am missing something:

[...]
[28/103] /usr/bin/clang++-14 -DHWY_SHARED_DEFINE -Dhwy_contrib_EXPORTS -I"/<<PKGBUILDDIR>>" -g -O2 -ffile-prefix-map=/<<PKGBUILDDIR>>=. -fstack-protector-strong -Wformat -Werror=format-security -DHWY_BROKEN_EMU128=0 -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -fvisibility=hidden -fvisibility-inlines-hidden -Wno-builtin-macro-redefined -D__DATE__=\"redacted\" -D__TIMESTAMP__=\"redacted\" -D__TIME__=\"redacted\" -fmerge-all-constants -Wall -Wextra -Wconversion -Wsign-conversion -Wvla -Wnon-virtual-dtor -Wfloat-overflow-conversion -Wfloat-zero-conversion -Wfor-loop-analysis -Wgnu-redeclared-enum -Winfinite-recursion -Wself-assign -Wstring-conversion -Wtautological-overlap-compare -Wthread-safety-analysis -Wundefined-func-template -fno-cxx-exceptions -fno-slp-vectorize -fno-vectorize -fdiagnostics-show-option -fcolor-diagnostics -Wc++2a-extensions -fmath-errno -fno-exceptions -march=rv64gcv1p0 -menable-experimental-extensions -MD -MT CMakeFiles/hwy_contrib.dir/hwy/contrib/image/image.cc.o -MF CMakeFiles/hwy_contrib.dir/hwy/contrib/image/image.cc.o.d -o CMakeFiles/hwy_contrib.dir/hwy/contrib/image/image.cc.o -c '/<<PKGBUILDDIR>>/hwy/contrib/image/image.cc'
[...]

thje above seems to puts v-extensions in shared logic:

@jan-wassenberg
Copy link
Member

Yes indeed, you could also write 'V' - 'A' (it is the same).

For the patch, we want to also do what was done for Arm (commit). This is a bit harder to see because the patch also included other required fixes, but the key parts we haven't yet added are:

  1. In detect_targets.h:405
    #define HWY_ATTAINABLE_TARGETS should also be done if ARCH_RVV in addition to ARCH_ARM.

  2. In set_macros-inl.h we want to #define HWY_TARGET_STR to whatever the compiler requires. This is usually xx if the compiler flag were -mxx, but on RVV we previously used -march=rv64gcv1p0 so I'm not sure what the syntax is.

Actually I see in the LLVM headers that RVV currently has the same issue as NEON/SVE: they do not yet support runtime dispatch. An issue has been filed and discussions are ongoing at least for Arm.

It's possible that GCC already supports this for RVV like they do for NEON/SVE. If so, you would see in its riscv_vector.h some target attribute in the function definitions, and that is what we want to define HWY_TARGET_STR to.

@malaterre
Copy link
Contributor Author

It's possible that GCC already supports this for RVV like they do for NEON/SVE. If so, you would see in its riscv_vector.h some target attribute in the function definitions, and that is what we want to define HWY_TARGET_STR to.

I cannot find a file riscv_vector.h in my gcc-12 install tree:

clang seems to offer it:

But as you guessed the include file is messed up:

% head -20 /usr/lib/llvm-14/lib/clang/14.0.6/include/riscv_vector.h | tail -5

#ifndef __riscv_vector
#error "Vector intrinsics require the vector extension."
#endif

@malaterre
Copy link
Contributor Author

@malaterre
Copy link
Contributor Author

Yes indeed, you could also write 'V' - 'A' (it is the same).

Right, silly me...

In any case I do not see this integrated upstream:

where did you come up with the value ? should we report an issue in linux upstream (getauxval may need this definition) ?

@jan-wassenberg
Copy link
Member

Nice, thanks for filing the LLVM issue.
For the AT_* values, I simply extrapolated from the fact that RISC-V extensions are (or were mostly) identified with a one-letter name, and the convention seems to be a dense bit array in alphabetical order.

@jan-wassenberg
Copy link
Member

For HWY_TARGET_STR, this comment suggests that arch=rv64gcv1p0 might be exactly what we want.

@malaterre
Copy link
Contributor Author

@malaterre
Copy link
Contributor Author

rdcycle is currently producing SIGILL on Debian env, so no progress on riscv64 for me until issue is solved:

copybara-service bot pushed a commit that referenced this issue Sep 1, 2022
RDCYCLE used to be in the base ISA but was demoted to an extension which is not yet ratified. Although it will be required for RVA20 profile, at least one board does not support it.

We use the compiler macro for checking whether the extension was passed as an march flag. Because this macro is likely not yet defined, this effectively disables RDCYCLE for now. Revisit after the extension has been ratified.

PiperOrigin-RevId: 471489968
@jan-wassenberg
Copy link
Member

Thanks for making us aware. This situation is regrettable: an important feature (cycle counter or even timer) has been demoted from the base spec (where it was when I last checked) to an extension, which is not yet ratified. I fail to see how it makes sense to ship a board without something as basic as a timer (nor how this was considered 'optional' in the spec), but both seem to have happened.

We will disable it for now and fall back to clock_gettime.

copybara-service bot pushed a commit that referenced this issue Sep 1, 2022
RDCYCLE used to be in the base ISA but was demoted to an extension which is not yet ratified. Although it will be required for RVA20 profile, at least one board does not support it.

We use the compiler macro for checking whether the extension was passed as an march flag. Because this macro is likely not yet defined, this effectively disables RDCYCLE for now. Revisit after the extension has been ratified.

PiperOrigin-RevId: 471489968
@malaterre
Copy link
Contributor Author

an important feature (cycle counter or even timer) has been demoted from the base spec (where it was when I last checked) to an extension

wait, what ? SIGILL is produced on the same actual physical board. I fail to understand why an instruction would suddenly fail to execute depending on the running linux kernel.

@jan-wassenberg
Copy link
Member

Here is a good looking example: https://github.com/randombit/botan/blob/master/src/lib/utils/os_utils.cpp#L694

I believe we are not ignoring the signal (SIG_IGN), nor are we restarting the offending instruction, because we'd siglongjmp out of trouble. Does that seem reasonable?

@malaterre
Copy link
Contributor Author

Here is a good looking example: https://github.com/randombit/botan/blob/master/src/lib/utils/os_utils.cpp#L694

Looks pretty impressive indeed ! So that would solve the first issue, the remaining one is that highway + RVV imply compiling all the sources code with the rv64gcv1p0 flag (cmake setting is done on a project basis instead, whereas it should be set using selected pragma on certain source file).

@jan-wassenberg
Copy link
Member

:) Right. I've commented on and subscribed to the issue you filed about that, fingers crossed it will happen soon. Once it does, we can do the SIGILL and then runtime dispatch is ready to go.

fgaz added a commit to fgaz/nixpkgs that referenced this issue Jan 24, 2024
It will not work on processors that lack the V extension until dynamic
dispatch is implemented: google/highway#838
fgaz added a commit to fgaz/nixpkgs that referenced this issue Jan 24, 2024
It will not work on processors that lack the V extension until runtime
dispatch is implemented: google/highway#838
fgaz added a commit to NixOS/nixpkgs that referenced this issue Jan 29, 2024
It will not work on processors that lack the V extension until runtime
dispatch is implemented: google/highway#838
@rwmjones
Copy link

rwmjones commented Feb 19, 2024

What's the recommended way to disable Vector for riscv64 now? cmake -DHWY_CMAKE_RVV=OFF doesn't seem to work and I can see clang is still invoked with -march=rv64gcv1p0. Edit: I see, it needs commit 5d58d23

felixonmars added a commit to felixonmars/archriscv-packages that referenced this issue Mar 8, 2024
Use new upstream proposed flag to turn off RVV:
google/highway#838
felixonmars added a commit to felixonmars/archriscv-packages that referenced this issue Mar 8, 2024
Use new upstream proposed flag to turn off RVV:
google/highway#838
copybara-service bot pushed a commit that referenced this issue Apr 10, 2024
copybara-service bot pushed a commit that referenced this issue Apr 10, 2024
copybara-service bot pushed a commit that referenced this issue Apr 10, 2024
copybara-service bot pushed a commit that referenced this issue Apr 10, 2024
copybara-service bot pushed a commit that referenced this issue Apr 10, 2024
copybara-service bot pushed a commit that referenced this issue Apr 10, 2024
copybara-service bot pushed a commit that referenced this issue Apr 10, 2024
copybara-service bot pushed a commit that referenced this issue Apr 10, 2024
copybara-service bot pushed a commit that referenced this issue Apr 10, 2024
copybara-service bot pushed a commit that referenced this issue Apr 11, 2024
copybara-service bot pushed a commit that referenced this issue Apr 12, 2024
copybara-service bot pushed a commit that referenced this issue Apr 15, 2024
copybara-service bot pushed a commit that referenced this issue Apr 15, 2024
@jan-wassenberg
Copy link
Member

Hooray, this is at long last done :D

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants