sockperf crashed due to segment fault when using libvma #216

Open
g199209 opened this issue Apr 23, 2023 · 13 comments

@g199209

g199209 commented Apr 23, 2023

Commit 4716b150645180735fdbf816adede0a39dc64e86 ("issue 3109144: Adding Unix Domain Socket support in SOCK_STREAM and SOCK_DGRAM") caused the bug.

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000000000957c15 in __gnu_cxx::__exchange_and_add (__val=0xffffffff, __mem=0xfffffffffffffff8) at /opt/rh/devtoolset-11/root/usr/include/c++/11/ext/atomicity.h:66
66	  { return __atomic_fetch_add(__mem, __val, __ATOMIC_ACQ_REL); }
Missing separate debuginfos, use: debuginfo-install glibc-2.17-260.el7_6.3.x86_64 libgcc-4.8.5-36.el7.x86_64 libibverbs-41mlnx1-OFED.4.5.0.1.0.45101.x86_64 libnl3-3.2.28-4.el7.x86_64 librdmacm-41mlnx1-OFED.4.2.0.1.3.45101.x86_64 libstdc++-4.8.5-36.el7.x86_64 numactl-libs-2.0.9-7.el7.x86_64
gef> bt
#0  0x0000000000957c15 in __gnu_cxx::__exchange_and_add (__val=0xffffffff, __mem=0xfffffffffffffff8) at /opt/rh/devtoolset-11/root/usr/include/c++/11/ext/atomicity.h:66
#1  __gnu_cxx::__exchange_and_add_dispatch (__val=0xffffffff, __mem=0xfffffffffffffff8) at /opt/rh/devtoolset-11/root/usr/include/c++/11/ext/atomicity.h:101
#2  std::string::_Rep::_M_dispose (__a=..., this=0xffffffffffffffe8) at /opt/rh/devtoolset-11/root/usr/include/c++/11/bits/basic_string.h:3348
#3  std::string::_Rep::_M_dispose (__a=..., this=0xffffffffffffffe8) at /opt/rh/devtoolset-11/root/usr/include/c++/11/bits/basic_string.h:3332
#4  std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string (this=0x7f4dcb787570, __in_chrg=<optimized out>) at /opt/rh/devtoolset-11/root/usr/include/c++/11/bits/basic_string.h:3768
#5  IPAddress::~IPAddress (this=0x7f4dcb787558, __in_chrg=<optimized out>) at src/ip_address.h:54
#6  user_params_t::~user_params_t (this=0x7f4dcb787500, __in_chrg=<optimized out>) at src/defs.h:692
#7  0x00007f4dca418eda in __cxa_finalize () from /lib64/libc.so.6
#8  0x00007f4dcb44a353 in ?? ()
#9  0x00007ffcd19e2c90 in ?? ()
#10 0x00007f4dcb79bfba in _dl_fini () from /lib64/ld-linux-x86-64.so.2
Backtrace stopped: frame did not save the PC

Build environment: gcc 8.3, Linux x86_64

Test command: ./sockperf pp -i xxx.xxx.xxx.xxx --load-vma

@igor-ivanov @EldarShalev

@g199209
Author

g199209 commented Apr 23, 2023

The v5.8-2.0.3.0 LTS MLNX_OFED driver ships sockperf v3.10, which also crashes.

@igor-ivanov
Collaborator

Hello @g199209.
Could you confirm the following?

  • the issue is reproduced with 4716b150645180735fdbf816adede0a39dc64e86
  • the issue is not seen before 4716b150645180735fdbf816adede0a39dc64e86
  • the issue also exists in sockperf v3.10
  • libvma is taken from v5.8-2.0.3.0 LTS MLNX_OFED

Is that correct?

@g199209
Author

g199209 commented Apr 27, 2023

> Hello @g199209. Could you confirm the following?
>
>   • the issue is reproduced with 4716b150645180735fdbf816adede0a39dc64e86
>   • the issue is not seen before 4716b150645180735fdbf816adede0a39dc64e86
>   • the issue also exists in sockperf v3.10
>   • libvma is taken from v5.8-2.0.3.0 LTS MLNX_OFED
>
> Is that correct?

Yes.

@g199209
Author

g199209 commented May 9, 2023

Is there any plan to fix this bug?

@g199209
Author

g199209 commented Nov 6, 2023

@igor-ivanov Any progress?

@igor-ivanov
Collaborator

Hello @g199209,

The failure you described does not happen on my setup.
libvma(v9.7.2) - from MLNX OFED 5.8-2.0.3.0
sockperf(version #3.8-21.git4716b1506451)

server:

$ sudo sockperf sr -i 192.168.105.3 --load-vma=libvma.so
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: VMA_VERSION: 9.7.2-1 Release built on Nov 14 2022 17:03:52
 VMA INFO: Cmd Line: sockperf sr -i 192.168.105.3 --load-vma=libvma.so
 VMA INFO: OFED Version: MLNX_OFED_LINUX-5.9-0.5.5.2:
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: Log Level                      INFO                       [VMA_TRACELEVEL]
 VMA INFO: ---------------------------------------------------------------------------
sockperf: == version #3.8-21.git4716b1506451 ==
sockperf: [SERVER] listen on:
[ 0] IP = 192.168.105.3   PORT = 11111 # UDP
sockperf: Warmup stage (sending a few dummy messages)...
sockperf: [tid 1570954] using recvfrom() to block on socket(s)
^Csockperf: Test end (interrupted by user)
sockperf: Total 204243 messages received and handled
sockperf: cleanupAfterLoop() exit

client:

$ sudo sockperf pp -i 192.168.105.3 --load-vma=libvma.so
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: VMA_VERSION: 9.7.2-1 Release built on Nov 14 2022 17:03:52
 VMA INFO: Cmd Line: sockperf pp -i 192.168.105.3 --load-vma=libvma.so
 VMA INFO: OFED Version: MLNX_OFED_LINUX-5.9-0.5.5.2:
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: Log Level                      INFO                       [VMA_TRACELEVEL]
 VMA INFO: ---------------------------------------------------------------------------
sockperf: == version #3.8-21.git4716b1506451 ==
sockperf[CLIENT] send on:sockperf: using recvfrom() to block on socket(s)

[ 0] IP = 192.168.105.3   PORT = 11111 # UDP
sockperf: Warmup stage (sending a few dummy messages)...
sockperf: Starting test...
sockperf: Test end (interrupted by timer)
sockperf: Test ended
sockperf: [Total Run] RunTime=1.000 sec; Warm up time=400 msec; SentMessages=204243; ReceivedMessages=204242
sockperf: ========= Printing statistics for Server No: 0
sockperf: [Valid Duration] RunTime=0.550 sec; SentMessages=126961; ReceivedMessages=126961
sockperf: ====> avg-latency=2.155 (std-dev=0.271, mean-ad=0.087, median-ad=0.103, siqr=0.075, cv=0.126, std-error=0.001, 99.0% ci=[2.153, 2.157])
sockperf: # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
sockperf: Summary: Latency is 2.155 usec
sockperf: Total 126961 observations; each percentile contains 1269.61 observations
sockperf: ---> <MAX> observation =   48.565
sockperf: ---> percentile 99.999 =   45.589
sockperf: ---> percentile 99.990 =    4.478
sockperf: ---> percentile 99.900 =    2.634
sockperf: ---> percentile 99.000 =    2.374
sockperf: ---> percentile 90.000 =    2.264
sockperf: ---> percentile 75.000 =    2.224
sockperf: ---> percentile 50.000 =    2.184
sockperf: ---> percentile 25.000 =    2.073
sockperf: ---> <MIN> observation =    1.823

It does not happen with sockperf v3.10 either.
Please consider compiling the current libvma (https://github.com/Mellanox/libvma) and sockperf from source, then checking whether the failure still occurs on your setup.

@igor-ivanov
Collaborator

@g199209 please share more details so the issue can be reproduced, or close it if it is no longer observed.

@g199209
Author

g199209 commented Nov 17, 2023

> @g199209 please share more details so the issue can be reproduced, or close it if it is no longer observed.

I'll compile libvma from source and test again in the next few days.

@g199209
Author

g199209 commented Nov 21, 2023

@igor-ivanov My colleague tried to build libvma, but it failed: Mellanox/libvma#1053

@igor-ivanov
Collaborator

> @igor-ivanov My colleague tried to build libvma, but it failed: Mellanox/libvma#1053

From my understanding, the build succeeded, but your colleague could not find a way to create the package.

@g199209
Author

g199209 commented Nov 23, 2023

> > @igor-ivanov My colleague tried to build libvma, but it failed: Mellanox/libvma#1053
>
> From my understanding, the build succeeded, but your colleague could not find a way to create the package.

We suspect that this may not be the correct way to build. Do you also build this way and then create the packages manually afterwards?

@igor-ivanov
Collaborator

igor-ivanov commented Nov 24, 2023

> > > @igor-ivanov My colleague tried to build libvma, but it failed: Mellanox/libvma#1053
> >
> > From my understanding, the build succeeded, but your colleague could not find a way to create the package.
>
> We suspect that this may not be the correct way to build. Do you also build this way and then create the packages manually afterwards?

What issue do you see with build_pkg.sh? It is used in practice now.

@igor-ivanov
Collaborator

@g199209 have you had a chance to reproduce the issue?
