Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RVV target test failures #2078

Closed
lsrcz opened this issue Apr 10, 2024 · 9 comments · Fixed by #2218
Closed

RVV target test failures #2078

lsrcz opened this issue Apr 10, 2024 · 9 comments · Fixed by #2218

Comments

@lsrcz
Copy link
Contributor

lsrcz commented Apr 10, 2024

Hello, I compiled the HEAD version of highway (commit: 3cb5c1a) with a recent snapshot of clang-19 (commit: 44af53b), and I got failed tests.

The tests were run on qemu-riscv, and the following is the test outputs:

$ ctest --rerun-failed --output-on-failure -v                                                                 ─╯
Test project /home/siruilu/highway/build
    Start 565: MatVecTestGroup/MatVecTest.TestAllMatVec/RVV  # GetParam() = 137438953472
1/2 Test #565: MatVecTestGroup/MatVecTest.TestAllMatVec/RVV  # GetParam() = 137438953472 ...***Failed    0.21 sec
Running main() from /home/siruilu/highway/build/googletest-src/googletest/src/gtest_main.cc
Note: Google Test filter = MatVecTestGroup/MatVecTest.TestAllMatVec/RVV
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from MatVecTestGroup/MatVecTest
[ RUN      ] MatVecTestGroup/MatVecTest.TestAllMatVec/RVV
f64/f64 6 x 8, with add: mismatch at 0 0.000000 0.000000; tol 0.000000
Abort at matvec_test.cc:178: Assert 0

    Start 643: SortTestGroup/SortTest.TestAllPartition/RVV  # GetParam() = 137438953472
2/2 Test #643: SortTestGroup/SortTest.TestAllPartition/RVV  # GetParam() = 137438953472 ....***Failed    0.04 sec
Running main() from /home/siruilu/highway/build/googletest-src/googletest/src/gtest_main.cc
Note: Google Test filter = SortTestGroup/SortTest.TestAllPartition/RVV
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from SortTestGroup/SortTest
[ RUN      ] SortTestGroup/SortTest.TestAllPartition/RVV
Abort at sort_test.cc:342: U128: asc 1 left[0] piv 0 2 compares before 2 1 border 8


0% tests passed, 2 tests failed out of 2

Total Test time (real) =   0.25 sec

The following tests FAILED:
        565 - MatVecTestGroup/MatVecTest.TestAllMatVec/RVV  # GetParam() = 137438953472 (Failed)
        643 - SortTestGroup/SortTest.TestAllPartition/RVV  # GetParam() = 137438953472 (Failed)
Errors while running CTest
@jan-wassenberg
Copy link
Member

Thanks for letting us know. Unfortunately RVV tests in our CI are currently disabled because the toolchain is crashing. We've filed an LLVM bug:)

@jan-wassenberg
Copy link
Member

Meanwhile, LLVM has rolled back the patch that caused the compiler crash. Our RVV tests are again usable. Would you like to file an LLVM issue for the test failure on clang-19?

@lsrcz
Copy link
Contributor Author

lsrcz commented May 28, 2024

Hi @jan-wassenberg , I have just rerun the tests with clang-17.0.6 and a recent version of clang (2ace7bd), and they both failed the SortTestGroup/SortTest.TestAllPartition/RVV test.

The MatVecTestGroup/MatVecTest.TestAllMatVec/RVV is no longer failing, so I believe that the bug causing it may have been fixed.

@jan-wassenberg
Copy link
Member

Interesting, thanks for checking. Our toolchain doesn't come with an exact version, but it is relatively close to LLVM HEAD. It seems to succeed with -march=rv64gcv1p0. What build flags are you using?

@lsrcz
Copy link
Contributor Author

lsrcz commented May 29, 2024

I am using the default flags, and I believe that it is -march=rv64gcv1p0 after checking the generated build.ninja.

However, I found that the test only fails when VLEN is 128. What VLEN are you using for your CI?

@jan-wassenberg
Copy link
Member

Oh, good catch! That's likely it. We seem to have 512-bit VLEN. Note that SortTag uses LMUL=1/2.

The problem is that the base case is meant to handle at least two vectors (and Partition relies upon that), but the base case is also VL-dependent and we only handle up to 16 elements. Hopefully we can cap the vector size.

@jan-wassenberg
Copy link
Member

The sort itself does check for the problem, but TestAllPartition did not, and soon will.
Unfortunately our CI doesn't work with 1024, so not entirely certain this fixes it.

@lsrcz
Copy link
Contributor Author

lsrcz commented May 30, 2024

Thanks. I tried the fa61572 commit, and it seems to work under VLEN=128/1024 on qemu.

@jan-wassenberg
Copy link
Member

Thanks for confirming!

copybara-service bot pushed a commit that referenced this issue May 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants