libfabric: Valgrind reports some still reachable bytes in usdf_fabric.c #6865

Open
dqwu opened this issue Jan 3, 2024 · 3 comments

dqwu commented Jan 3, 2024

This issue is reproducible on most ANL CELS GCE nodes running Ubuntu 20.04 (e.g., compute-10.cels.anl.gov).

gcc --version
gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0

Steps to reproduce with the latest MPICH main branch:

git clone https://github.com/pmodels/mpich.git
cd mpich

git submodule update --init
./autogen.sh

CC=gcc CXX=g++ FC=gfortran CFLAGS="-g" ./configure \
--prefix=/path/to/mpich/installation --disable-fortran --with-hwloc=embedded --disable-psm3
make -j8
make install

export PATH=/path/to/mpich/installation/bin:$PATH

cat <<EOF > test_mpi.c
#include <mpi.h>
int main(int argc, char *argv[])
{
  MPI_Init(&argc, &argv);
  MPI_Finalize();

  return 0;
}
EOF

mpicc -g test_mpi.c
mpiexec -n 2 valgrind --leak-check=full --show-leak-kinds=all ./a.out

Sample output:

==1763688== HEAP SUMMARY:
==1763688==     in use at exit: 1,760 bytes in 2 blocks
==1763688==   total heap usage: 10,974 allocs, 10,972 frees, 19,682,880 bytes allocated
==1763688== 
==1763688== 24 bytes in 1 blocks are still reachable in loss record 1 of 2
==1763688==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==1763688==    by 0x7EEC861: verbs_register_driver_25 (in /usr/lib/x86_64-linux-gnu/libibverbs.so.1.8.28.0)
==1763688==    by 0x4011B99: call_init.part.0 (dl-init.c:72)
==1763688==    by 0x4011CA0: call_init (dl-init.c:30)
==1763688==    by 0x4011CA0: _dl_init (dl-init.c:119)
==1763688==    by 0x4001139: ??? (in /usr/lib/x86_64-linux-gnu/ld-2.31.so)
==1763688== 
==1763688== 1,736 bytes in 1 blocks are still reachable in loss record 2 of 2
==1763688==    at 0x483DD99: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==1763688==    by 0x6CC1E9E: usdf_get_devinfo (usdf_fabric.c:526)
==1763688==    by 0x6CC1E9E: usdf_getinfo (usdf_fabric.c:775)
==1763688==    by 0x6C37B86: fi_getinfo (fabric.c:1240)
==1763688==    by 0x4CA39C5: find_provider (init_provider.c:126)
==1763688==    by 0x4CA39C5: MPIDI_OFI_find_provider (init_provider.c:83)
==1763688==    by 0x4C7F96D: MPIDI_OFI_init_local (ofi_init.c:760)
==1763688==    by 0x4C159C0: MPID_Init (ch4_init.c:551)
==1763688==    by 0x4B37A8B: MPII_Init_thread (mpir_init.c:274)
==1763688==    by 0x4B382C9: MPIR_Init_impl (mpir_init.c:135)
==1763688==    by 0x498ADD5: internal_Init (c_binding.c:48575)
==1763688==    by 0x498ADD5: PMPI_Init (c_binding.c:48626)
==1763688==    by 0x10918E: main (test_mpi.c:4)
==1763688== 
==1763688== LEAK SUMMARY:
==1763688==    definitely lost: 0 bytes in 0 blocks
==1763688==    indirectly lost: 0 bytes in 0 blocks
==1763688==      possibly lost: 0 bytes in 0 blocks
==1763688==    still reachable: 1,760 bytes in 2 blocks
==1763688==         suppressed: 0 bytes in 0 blocks
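
For readers unfamiliar with the category: "still reachable" means a pointer to the block still existed when the process exited, so the memory was never lost, only never freed. Below is a minimal C sketch of one pattern that produces this kind of report, assuming a library caches device information in a static pointer on first use. The names `devinfo_cache`, `get_devinfo`, `release_devinfo`, and `devinfo_is_cached` are hypothetical and are not taken from libfabric.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a provider's cached device table.
 * This is NOT libfabric's actual structure. */
struct devinfo_cache {
    int ndevices;
    char names[8][32];
};

/* A static pointer keeps the block reachable until the process exits. */
static struct devinfo_cache *g_devinfo = NULL;

/* Allocate the cache on first use; later calls return the same block.
 * Returns NULL only if the allocation fails. */
struct devinfo_cache *get_devinfo(void)
{
    if (g_devinfo == NULL) {
        g_devinfo = calloc(1, sizeof(*g_devinfo));
    }
    return g_devinfo;
}

/* Optional teardown: free the cache and reset the pointer. */
void release_devinfo(void)
{
    free(g_devinfo);
    g_devinfo = NULL;
}

/* Returns 1 while the cache is allocated, 0 otherwise. */
int devinfo_is_cached(void)
{
    return g_devinfo != NULL;
}
```

A program that calls `get_devinfo()` and exits without calling `release_devinfo()` should show the block as still reachable under `valgrind --leak-check=full --show-leak-kinds=all`, because `g_devinfo` still points at it. Calling `release_devinfo()` before exit (directly or from a finalization hook) would remove the report.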

dqwu commented Jan 3, 2024

@hzhou Is this a known issue to you?

hzhou changed the title from "Valgrind reports some still reachable bytes in usdf_fabric.c" to "libfabric: Valgrind reports some still reachable bytes in usdf_fabric.c" on Jan 3, 2024

hzhou commented Jan 3, 2024

> @hzhou Is this a known issue to you?

Thanks for reporting. Since it is a libfabric issue, it is of lower priority for us. Is it important to you to get the library Valgrind clean?


dqwu commented Jan 3, 2024

> > @hzhou Is this a known issue to you?
>
> Thanks for reporting. Since it is a libfabric issue, it is of lower priority for us. Is it important to you to get the library Valgrind clean?

@hzhou No, getting the library Valgrind clean is not important to me. I am just continuing to report confirmed leaks as usual (e.g., #6185). FYI, I think this is similar to some of the libfabric leaks in t.log in ofiwg/libfabric#8091.
