Unexpected bus error on Linux/powerpc (32-bit) #614

Open
bhaible opened this issue Jan 23, 2024 · 9 comments

@bhaible
Contributor

bhaible commented Jan 23, 2024

On Linux/powerpc (32-bit), gc-8.2.4 compiles fine but has two test failures:

FAIL: gctest
FAIL: disclaim_weakmap_test

Here's the log from the VM created through
https://git.savannah.gnu.org/gitweb/?p=gnulib/maint-tools.git;a=blob;f=platforms/environments/qemu/powerpc-linux-debian12.txt :

================================
   gc 8.2.4: ./test-suite.log
================================

# TOTAL: 16
# PASS:  14
# SKIP:  0
# XFAIL: 0
# FAIL:  2
# XPASS: 0
# ERROR: 0

.. contents:: :depth: 2

FAIL: gctest
============

Switched to incremental mode
Emulating dirty bits with mprotect/signals
Unexpected bus error or segmentation fault at 0x5d
Unexpected bus error or segmentation fault
Child process failed, pid= 26669, status= 0x6
Test failed
FAIL gctest (exit status: 134)
Unexpected bus error or segmentation fault at 0x21f
Unexpected bus error or segmentation fault
Unexpected bus error or segmentation fault at 0x21f
Unexpected bus error or segmentation fault
Unexpected bus error or segmentation fault at 0x3f
Unexpected bus error or segmentation fault
Unexpected bus error or segmentation fault at 0x49
Unexpected bus error or segmentation fault

FAIL: disclaim_weakmap_test
===========================

Unexpected bus error or segmentation fault at 0x587d2aa7
Unexpected bus error or segmentation fault
Unexpected bus error or segmentation fault at 0x587d2abf
Unexpected bus error or segmentation fault
FAIL disclaim_weakmap_test (exit status: 134)

Here's the log from the VM created through
https://git.savannah.gnu.org/gitweb/?p=gnulib/maint-tools.git;a=blob;f=platforms/environments/qemu/powerpc-linux-t2sde.txt :

================================
   gc 8.2.4: ./test-suite.log
================================

# TOTAL: 16
# PASS:  14
# SKIP:  0
# XFAIL: 0
# FAIL:  2
# XPASS: 0
# ERROR: 0

.. contents:: :depth: 2

FAIL: gctest
============

Switched to incremental mode
Emulating dirty bits with mprotect/signals
Unexpected bus error or segmentation fault at 0x21f
Unexpected bus error or segmentation fault
Unexpected bus error or segmentation fault at 0x21f
Unexpected bus error or segmentation fault
Unexpected bus error or segmentation fault at 0x21f
Unexpected bus error or segmentation fault
Child process failed, pid= 5294, status= 0x86
Test failed
Aborted (core dumped)
FAIL gctest (exit status: 134)

FAIL: disclaim_weakmap_test
===========================

Unexpected bus error or segmentation fault at 0x5c1bb617
Unexpected bus error or segmentation fault at 0x5c1bb1af
Unexpected bus error or segmentation fault
Aborted (core dumped)
FAIL disclaim_weakmap_test (exit status: 134)

Note: This is unlike the 32-bit mode on Linux/powerpc64 systems that have a POWER CPU, such as cfarm110.cfarm.net. On that system, all tests pass.
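For readers decoding the numbers in the logs above: the `status= 0x6` and `status= 0x86` values and the `exit status: 134` lines are consistent with a process killed by SIGABRT (signal 6). The sketch below assumes that the test prints the raw wait status as returned by `waitpid()`; the helper names are hypothetical and are not part of gctest.

```c
#define _DEFAULT_SOURCE /* exposes WCOREDUMP on glibc */
#include <signal.h>
#include <stdbool.h>
#include <sys/wait.h>

/* Signal that terminated the child, or 0 if it was not killed by a signal. */
int term_signal(int status)
{
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}

/* True if the child was killed by a signal and a core dump was produced. */
bool dumped_core(int status)
{
    return WIFSIGNALED(status) && WCOREDUMP(status);
}

/* Exit status as reported by bash and most shells: 128 + signal number
   for a signal death, otherwise the ordinary exit code. */
int shell_exit_status(int status)
{
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);
    return WEXITSTATUS(status);
}
```

Under that assumption, 0x6 and 0x86 both decode to SIGABRT, the 0x80 bit in 0x86 marks the core dump, and 134 is 128 + 6.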

@ivmai ivmai changed the title from "Linux/powerpc support broken" to "Unexpected bus error on Linux/powerpc (32-bit)" on Jan 23, 2024
@ivmai
Owner

ivmai commented Jan 23, 2024

> Unexpected bus error or segmentation fault at 0x21f

Looks like a null pointer dereference. Could you please provide a stack trace?

Is this reproducible on the release-8_2 or master branch?

Is this reproducible with the environment variable GC_DISABLE_INCREMENTAL=1?
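The reasoning behind "looks like a null pointer dereference" can be sketched as follows: a fault address that lies inside the first page of the address space usually comes from reading a struct member or array element through a NULL base pointer, so the address equals the member offset. The helpers below are hypothetical illustrations, not libgc functions, and the 4096-byte page size used in the usage note is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Address touched when a member at byte `offset` is accessed through `base`. */
uintptr_t member_address(uintptr_t base, size_t offset)
{
    return base + offset;
}

/* Heuristic only: an address within the first page is most likely
   base == NULL plus a small member offset. */
bool looks_like_null_deref(uintptr_t fault_addr, uintptr_t page_size)
{
    assert(page_size > 0);
    return fault_addr < page_size;
}
```

With a 4096-byte page, the gctest addresses (0x21f, 0x5d, 0x3f, 0x49) match the heuristic, while the disclaim_weakmap_test addresses (for example 0x587d2aa7) do not, which suggests that the latter are stray pointers rather than plain NULL accesses.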

@bhaible
Contributor Author

bhaible commented Jan 23, 2024

> Could you please provide a stack trace?

Here are the stack traces.

$ ls -l *.core
-rw------- 1 bruno bruno 68104192 Jan 23 21:54 disclaim_weakma_1706043279_1670.core
-rw------- 1 bruno bruno 43003904 Jan 23 21:49 gctest_1706042998_510.core
-rw------- 1 bruno bruno 43413504 Jan 23 21:49 gctest_1706042998_530.core
-rw------- 1 bruno bruno 43413504 Jan 23 21:49 gctest_1706042998_533.core
$ ./libtool --mode=execute gdb gctest gctest_1706042998_510.core
(gdb) where
#0  0x007a2a20 in __pthread_kill_implementation (threadid=<optimized out>, 
    signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:43
#1  0x007a2aac in __pthread_kill_internal (signo=6, threadid=<optimized out>)
    at pthread_kill.c:78
#2  0x007448b4 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3  0x00729bc8 in __GI_abort () at abort.c:79
#4  0x009c6a60 in run_one_test () at ../tests/test.c:1393
#5  0x009c6b50 in thr_run_one_test (arg=<optimized out>)
    at ../tests/test.c:2356
#6  0x00969208 in GC_inner_start_routine (
    sb=<error reading variable: value has been optimized out>, 
    arg=<error reading variable: value has been optimized out>)
    at ../pthread_start.c:57
#7  0x00958538 in GC_call_with_stack_base (fn=<optimized out>, 
    arg=<optimized out>) at ../extra/../misc.c:2176
#8  0x0095859c in GC_start_routine (arg=<optimized out>)
    at ../extra/../pthread_support.c:2198
#9  0x007a04a4 in start_thread (arg=0xa6437420) at pthread_create.c:444
#10 0x0083d5c4 in clone ()
    at ../sysdeps/unix/sysv/linux/powerpc/powerpc32/clone.S:78

Source of frame #4:

      if (GC_size(x) != 8 && GC_size(y) != MIN_WORDS * sizeof(GC_word)) {
        GC_printf("GC_size produced unexpected results\n");
        FAIL;                                                               <== line 1393
      }
$ ./libtool --mode=execute gdb gctest gctest_1706042998_530.core
(gdb) where
#0  0x007a2a20 in __pthread_kill_implementation (threadid=<optimized out>, 
    signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:43
#1  0x007a2aac in __pthread_kill_internal (signo=6, threadid=<optimized out>)
    at pthread_kill.c:78
#2  0x007448b4 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3  0x00729bc8 in __GI_abort () at abort.c:79
#4  0x009543a0 in GC_write_fault_handler (sig=<optimized out>, 
    si=<optimized out>, raw_sc=<optimized out>) at ../extra/../os_dep.c:3416
#5  <signal handler called>
#6  0x0094ba70 in GC_is_marked (p=<optimized out>) at ../extra/../mark.c:196
#7  0x00958b14 in GC_make_disappearing_links_disappear (is_remove_dangling=0, 
    dl_hashtbl=0x9994f0 <GC_arrays+156>) at ../extra/../finalize.c:945
#8  GC_finalize () at ../extra/../finalize.c:993
#9  GC_finish_collection () at ../extra/../alloc.c:1180
#10 0x0095c750 in GC_maybe_gc () at ../extra/../alloc.c:536
#11 GC_collect_a_little_inner (n=n@entry=1) at ../extra/../alloc.c:769
#12 0x00963860 in GC_generic_malloc_many (lb=16, k=k@entry=1, 
    result=result@entry=0xa7c7d604) at ../extra/../mallocx.c:343
#13 0x00963ea8 in GC_malloc_kind (bytes=bytes@entry=12, kind=kind@entry=1)
    at ../extra/../thread_local_alloc.c:187
#14 0x00964a28 in GC_malloc (lb=lb@entry=12) at ../extra/../malloc.c:358
#15 0x009c52e8 in mktree (n=n@entry=1) at ../tests/test.c:915
#16 0x009c533c in mktree (n=n@entry=2) at ../tests/test.c:931
#17 0x009c532c in mktree (n=n@entry=3) at ../tests/test.c:930
#18 0x009c532c in mktree (n=n@entry=4) at ../tests/test.c:930
#19 0x009c532c in mktree (n=n@entry=5) at ../tests/test.c:930
#20 0x009c532c in mktree (n=n@entry=6) at ../tests/test.c:930
#21 0x009c532c in mktree (n=n@entry=7) at ../tests/test.c:930
#22 0x009c532c in mktree (n=n@entry=8) at ../tests/test.c:930
#23 0x009c533c in mktree (n=n@entry=9) at ../tests/test.c:931
#24 0x009c532c in mktree (n=n@entry=10) at ../tests/test.c:930
#25 0x009c533c in mktree (n=n@entry=11) at ../tests/test.c:931
#26 0x009c533c in mktree (n=n@entry=12) at ../tests/test.c:931
#27 0x009c532c in mktree (n=n@entry=13) at ../tests/test.c:930
#28 0x009c532c in mktree (n=n@entry=14) at ../tests/test.c:930
#29 0x009c533c in mktree (n=n@entry=15) at ../tests/test.c:931
#30 0x009c532c in mktree (n=n@entry=16) at ../tests/test.c:930
#31 0x009c5b70 in tree_test () at ../tests/test.c:1153
#32 0x009c6964 in run_one_test () at ../tests/test.c:1594
#33 0x009c6b50 in thr_run_one_test (arg=<optimized out>)
    at ../tests/test.c:2356
#34 0x00969208 in GC_inner_start_routine (
    sb=<error reading variable: value has been optimized out>, 
    arg=<error reading variable: value has been optimized out>)
    at ../pthread_start.c:57
#35 0x00958538 in GC_call_with_stack_base (fn=<optimized out>, 
    arg=<optimized out>) at ../extra/../misc.c:2176
#36 0x0095859c in GC_start_routine (arg=<optimized out>)
    at ../extra/../pthread_support.c:2198
#37 0x007a04a4 in start_thread (arg=0xa6437420) at pthread_create.c:444
#38 0x0083d5c4 in clone ()
    at ../sysdeps/unix/sysv/linux/powerpc/powerpc32/clone.S:78

Source of frame #4:

      ABORT_ARG1("Unexpected bus error or segmentation fault",     <== line 3416
                 " at %p", (void *)addr);

Source of frame #6:

GC_API int GC_CALL GC_is_marked(const void *p)
{
    struct hblk *h = HBLKPTR(p);
    hdr * hhdr = HDR(h);
    word bit_no = MARK_BIT_NO((ptr_t)p - (ptr_t)h, hhdr -> hb_sz);

    return (int)mark_bit_from_hdr(hhdr, bit_no); /* 0 or 1 */
}                                                                  <== line 196
$ ./libtool --mode=execute gdb gctest gctest_1706042998_533.core
(gdb) where
#0  0x007a2a20 in __pthread_kill_implementation (threadid=<optimized out>, 
    signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:43
#1  0x007a2aac in __pthread_kill_internal (signo=6, threadid=<optimized out>)
    at pthread_kill.c:78
#2  0x007448b4 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3  0x00729bc8 in __GI_abort () at abort.c:79
#4  0x009543a0 in GC_write_fault_handler (sig=<optimized out>, 
    si=<optimized out>, raw_sc=<optimized out>) at ../extra/../os_dep.c:3416
#5  <signal handler called>
#6  0x0094ba70 in GC_is_marked (p=<optimized out>) at ../extra/../mark.c:196
#7  0x00958b14 in GC_make_disappearing_links_disappear (is_remove_dangling=0, 
    dl_hashtbl=0x9994f0 <GC_arrays+156>) at ../extra/../finalize.c:945
#8  GC_finalize () at ../extra/../finalize.c:993
#9  GC_finish_collection () at ../extra/../alloc.c:1180
#10 0x0095c750 in GC_maybe_gc () at ../extra/../alloc.c:536
#11 GC_collect_a_little_inner (n=n@entry=1) at ../extra/../alloc.c:769
#12 0x00963860 in GC_generic_malloc_many (lb=16, k=k@entry=1, 
    result=result@entry=0xa7c7d364) at ../extra/../mallocx.c:343
#13 0x00963ea8 in GC_malloc_kind (bytes=bytes@entry=12, kind=kind@entry=1)
    at ../extra/../thread_local_alloc.c:187
#14 0x00964a28 in GC_malloc (lb=lb@entry=12) at ../extra/../malloc.c:358
#15 0x009c52e8 in mktree (n=n@entry=0) at ../tests/test.c:915
#16 0x009c532c in mktree (n=n@entry=1) at ../tests/test.c:930
#17 0x009c533c in mktree (n=n@entry=2) at ../tests/test.c:931
#18 0x009c532c in mktree (n=n@entry=3) at ../tests/test.c:930
#19 0x009c533c in mktree (n=n@entry=4) at ../tests/test.c:931
#20 0x009c533c in mktree (n=n@entry=5) at ../tests/test.c:931
#21 0x009c532c in mktree (n=n@entry=6) at ../tests/test.c:930
#22 0x009c533c in mktree (n=n@entry=7) at ../tests/test.c:931
#23 0x009c532c in mktree (n=n@entry=8) at ../tests/test.c:930
#24 0x009c532c in mktree (n=n@entry=9) at ../tests/test.c:930
#25 0x009c532c in mktree (n=n@entry=10) at ../tests/test.c:930
#26 0x009c533c in mktree (n=n@entry=11) at ../tests/test.c:931
#27 0x009c533c in mktree (n=n@entry=12) at ../tests/test.c:931
#28 0x009c532c in mktree (n=n@entry=13) at ../tests/test.c:930
#29 0x009c532c in mktree (n=n@entry=14) at ../tests/test.c:930
#30 0x009c533c in mktree (n=n@entry=15) at ../tests/test.c:931
#31 0x009c532c in mktree (n=n@entry=16) at ../tests/test.c:930
#32 0x009c5b70 in tree_test () at ../tests/test.c:1153
#33 0x009c6964 in run_one_test () at ../tests/test.c:1594
#34 0x009c6b50 in thr_run_one_test (arg=<optimized out>)
    at ../tests/test.c:2356
#35 0x00969208 in GC_inner_start_routine (
    sb=<error reading variable: value has been optimized out>, 
    arg=<error reading variable: value has been optimized out>)
    at ../pthread_start.c:57
#36 0x00958538 in GC_call_with_stack_base (fn=<optimized out>, 
    arg=<optimized out>) at ../extra/../misc.c:2176
#37 0x0095859c in GC_start_routine (arg=<optimized out>)
    at ../extra/../pthread_support.c:2198
#38 0x007a04a4 in start_thread (arg=0xa5c36420) at pthread_create.c:444
#39 0x0083d5c4 in clone ()
    at ../sysdeps/unix/sysv/linux/powerpc/powerpc32/clone.S:78

Similar to the previous one.

$ ./libtool --mode=execute gdb disclaim_weakmap_test disclaim_weakma_1706043279_1670.core
(gdb) where
#0  0x00422a20 in __pthread_kill_implementation (threadid=<optimized out>, 
    signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:43
#1  0x00422aac in __pthread_kill_internal (signo=6, threadid=<optimized out>)
    at pthread_kill.c:78
#2  0x003c48b4 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3  0x003a9bc8 in __GI_abort () at abort.c:79
#4  0x005d43a0 in GC_write_fault_handler (sig=<optimized out>, 
    si=<optimized out>, raw_sc=<optimized out>) at ../extra/../os_dep.c:3416
#5  <signal handler called>
#6  __GI_memcmp (s1=s1@entry=0x583c396f, s2=s2@entry=0xa741e748, 
    len=len@entry=8) at memcmp.c:343
#7  0x006415c0 in weakmap_add (wm=0xa7c23aa0, obj=obj@entry=0xa741e748, 
    obj_size=obj_size@entry=28) at ../tests/disclaim_weakmap_test.c:189
#8  0x00641a48 in pair_new (car=car@entry=0xa7c25ec4, cdr=cdr@entry=0xa7c25fe4)
    at ../tests/disclaim_weakmap_test.c:359
#9  0x00641d60 in test (data=0x0) at ../tests/disclaim_weakmap_test.c:409
#10 0x005e9208 in GC_inner_start_routine (
    sb=<error reading variable: value has been optimized out>, 
    arg=<error reading variable: value has been optimized out>)
    at ../pthread_start.c:57
#11 0x005d8538 in GC_call_with_stack_base (fn=<optimized out>, 
    arg=<optimized out>) at ../extra/../misc.c:2176
#12 0x005d859c in GC_start_routine (arg=<optimized out>)
    at ../extra/../pthread_support.c:2198
#13 0x004204a4 in start_thread (arg=0xa741f420) at pthread_create.c:444
#14 0x004bd5c4 in clone ()
    at ../sysdeps/unix/sysv/linux/powerpc/powerpc32/clone.S:78

Source of frame #4:

      ABORT_ARG1("Unexpected bus error or segmentation fault",     <== line 3416
                 " at %p", (void *)addr);

Source of frame #6 (glibc-2.37/string/memcmp.c):

  /* There are just a few bytes to compare.  Use byte memory operations.  */
  while (len != 0)                                                 <== line 343
    {
      a0 = ((byte *) srcp1)[0];
      b0 = ((byte *) srcp2)[0];
      srcp1 += 1;
      srcp2 += 1;
      res = a0 - b0;
      if (res != 0)
        return res;
      len -= 1;
    }

@bhaible
Contributor Author

bhaible commented Jan 23, 2024

> Is this reproducible on the release-8_2 or master branch?

Yes, I can reproduce the same failures with the tip of the release-8_2 branch.

@bhaible
Contributor Author

bhaible commented Jan 23, 2024

> Is this reproducible with the environment variable GC_DISABLE_INCREMENTAL=1?

No. With the environment variable GC_DISABLE_INCREMENTAL=1 set, all tests pass.

@ivmai
Owner

ivmai commented Jan 24, 2024

Might be related to #370

@ivmai
Owner

ivmai commented Jan 24, 2024

It seems that some SIGSEGV (or SIGBUS) signal is somehow lost.

Is this reproducible on real hardware (not QEMU)?

@ivmai
Owner

ivmai commented Jan 24, 2024

Try to reproduce the issue with the --disable-munmap --disable-threads flags (passed to configure). Then please also add --enable-checksums (in addition to the former two): it forces libgc to check the mprotect-based VDB (virtual dirty bits) and to report any errors found.
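For context, the general technique behind "Emulating dirty bits with mprotect/signals" can be sketched as below. This is a minimal POSIX illustration under stated assumptions, not libgc's implementation: the `tracker_*` names, the fixed four-page region and the exit codes are all hypothetical. The branch that calls `_exit(2)` plays the role of the ABORT_ARG1 call in GC_write_fault_handler quoted earlier: a fault at an address the tracker does not own is treated as a genuine crash.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define N_PAGES 4

static char *region;                          /* tracked memory */
static size_t page_size;
static volatile sig_atomic_t dirty[N_PAGES];  /* one emulated dirty bit per page */

/* Write-fault handler: record the page as dirty, then unprotect it so that
   the faulting store is restarted and succeeds. */
static void on_write_fault(int sig, siginfo_t *si, void *ctx)
{
    uintptr_t addr = (uintptr_t)si->si_addr;
    uintptr_t base = (uintptr_t)region;
    size_t idx;

    (void)sig;
    (void)ctx;
    if (addr < base || addr >= base + N_PAGES * page_size)
        _exit(2); /* fault outside the tracked region: an unexpected fault */

    idx = (addr - base) / page_size;
    dirty[idx] = 1;
    /* mprotect() is not formally async-signal-safe; this sketch relies on it
       working inside a handler, as it does on Linux. */
    if (mprotect(region + idx * page_size, page_size,
                 PROT_READ | PROT_WRITE) != 0)
        _exit(3);
}

/* Map the region and install the handler. Returns 0 on success. */
int tracker_init(void)
{
    struct sigaction sa;

    page_size = (size_t)sysconf(_SC_PAGESIZE);
    assert(page_size > 0);
    region = mmap(NULL, N_PAGES * page_size, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_write_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0 || sigaction(SIGBUS, &sa, NULL) != 0)
        return -1;
    return 0;
}

/* Start a tracking cycle: clear all dirty bits and write-protect every page. */
int tracker_reset(void)
{
    size_t i;

    for (i = 0; i < N_PAGES; i++)
        dirty[i] = 0;
    return mprotect(region, N_PAGES * page_size, PROT_READ);
}

char *tracker_page(size_t i) { return region + i * page_size; }
int tracker_is_dirty(size_t i) { return dirty[i] != 0; }
```

After `tracker_init()` and `tracker_reset()`, the first store to a page triggers the handler once and sets that page's bit; later stores to the same page run at full speed. If such a fault were lost or reported with a wrong address, a modified page would stay marked clean, which is the kind of error that --enable-checksums is meant to detect.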

@bhaible
Contributor Author

bhaible commented Jan 26, 2024

> Try to reproduce the issue with --disable-munmap --disable-threads flags (passed to configure)

The issue is reproducible with --disable-munmap --disable-threads:

$ cat gctest.log
Switched to incremental mode
Emulating dirty bits with mprotect/signals
Unexpected bus error or segmentation fault at 0x5c
Unexpected bus error or segmentation fault
Child process failed, pid= 8575, status= 0x86
Test failed
FAIL gctest (exit status: 134)
$ cat disclaim_weakmap_test.log
Unexpected bus error or segmentation fault at 0xfe56305f
Unexpected bus error or segmentation fault
FAIL disclaim_weakmap_test (exit status: 134)

> please also add --enable-checksums

When I do this, gctest does not terminate within 1 day:

$ cat gctest.log
Switched to incremental mode
Emulating dirty bits with mprotect/signals
Found 1 dirty bit errors (0 were faulted)
Found 2 dirty bit errors (0 were faulted)
Found 11 dirty bit errors (0 were faulted)
Found 1 dirty bit errors (0 were faulted)
Found 1 dirty bit errors (0 were faulted)

@ivmai
Owner

ivmai commented Jan 26, 2024

> Found 1 dirty bit errors (0 were faulted)

This also means a bug exists.
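To illustrate why a "dirty bit error" points to a bug: a checksum-based check can compare each page's current checksum with the one saved at the previous cycle, and a page whose contents changed while its dirty bit is clear means that a write went unnoticed. The sketch below only illustrates that idea. The function names and the checksum are hypothetical, and libgc's actual check may differ in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simple polynomial checksum over one page (a stand-in, not libgc's). */
uint32_t page_checksum(const unsigned char *page, size_t page_size)
{
    uint32_t sum = 0;
    size_t i;

    assert(page != NULL);
    for (i = 0; i < page_size; i++)
        sum = sum * 31u + page[i];
    return sum;
}

/* Count pages whose contents changed since old_sums[] was recorded
   although their dirty flag is clear. */
size_t count_dirty_bit_errors(const unsigned char *heap, size_t page_size,
                              size_t n_pages, const uint32_t *old_sums,
                              const unsigned char *dirty)
{
    size_t errors = 0;
    size_t i;

    for (i = 0; i < n_pages; i++) {
        uint32_t now = page_checksum(heap + i * page_size, page_size);

        if (now != old_sums[i] && !dirty[i])
            errors++;
    }
    return errors;
}
```

A nonzero count means that the write tracking missed at least one modification, so an incremental collection could skip rescanning a page that actually changed.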
