You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007f4c14be818c, pid=92950, tid=0x00007f4bd2649700
#
# JRE version: OpenJDK Runtime Environment (8.0_342-b07) (build 1.8.0_342-b07)
# Java VM: OpenJDK 64-Bit Server VM (25.342-b07 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C [libNRJavaSerial.so+0x818c] Java_gnu_io_RXTXPort_interruptEventLoop+0x4c
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# If you would like to submit a bug report, please visit:
# https://github.com/corretto/corretto-8/issues/
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
--------------- T H R E A D ---------------
Current thread (0x00007f4bf063b800): JavaThread "SpringContextShutdownHook" daemon [_thread_in_native, id=621408, stack(0x00007f4bd2611000,0x00007f4bd264a000)]
siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x0000000d00000000
Registers:
RAX=0x00000000000000c3, RBX=0x00000000000000c3, RCX=0x0000000000000000, RDX=0x0000000d00000000
RSP=0x00007f4bd2647bd0, RBP=0x00007f4c14dee380, RSI=0x00007f4c14beaad6, RDI=0x00007f4c14beaad6
R8 =0x00007f4c3529d6e0, R9 =0x000000000000000b, R10=0x00007f4c14ffb2bc, R11=0x00007f4c34cb3b80
R12=0x00007f4c14beaf23, R13=0x00007f4c14ffdc80, R14=0x00007f4bd2647c60, R15=0x00007f4bf063b800
RIP=0x00007f4c14be818c, EFLAGS=0x0000000000010206, CSGSFS=0x002b000000000033, ERR=0x0000000000000004
TRAPNO=0x000000000000000e
Top of Stack: (sp=0x00007f4bd2647bd0)
0x00007f4bd2647bd0: 00007f4c14ffdc80 00007f4bd2647c40
0x00007f4bd2647be0: 0000000000000000 00007f4c272a01a8
0x00007f4bd2647bf0: 00007f4bd2647c40 00007f4c299421a6
0x00007f4bd2647c00: 00007f4bd2647c00 0000000000000000
0x00007f4bd2647c10: 00007f4bd2647c60 00007f4c14fc5f30
0x00007f4bd2647c20: 0000000000000000 00007f4c14ffdc80
0x00007f4bd2647c30: 0000000000000000 00007f4bd2647c60
0x00007f4bd2647c40: 00007f4bd2647ca8 00007f4c2728b09d
0x00007f4bd2647c50: 0000000000000000 00007f4c27299a78
0x00007f4bd2647c60: 00000000fc6b3f20 00007f4bd2647c68
0x00007f4bd2647c70: 00007f4c14ffe256 00007f4bd2647cc0
0x00007f4bd2647c80: 00007f4c14fc5f30 0000000000000000
0x00007f4bd2647c90: 00007f4c14ffe2d8 00007f4bd2647c60
0x00007f4bd2647ca0: 00007f4bd2647cc0 00007f4bd2647d08
0x00007f4bd2647cb0: 00007f4c2728b09d 0000000000000000
0x00007f4bd2647cc0: 00000000fc6b3f20 00007f4bd2647cc8
0x00007f4bd2647cd0: 00007f4c14ffee03 00007f4bd2647d28
0x00007f4bd2647ce0: 00007f4c14fc5f30 0000000000000000
0x00007f4bd2647cf0: 00007f4c14ffeed8 00007f4bd2647cc0
0x00007f4bd2647d00: 00007f4bd2647d28 00007f4bd2647d70
0x00007f4bd2647d10: 00007f4c2728b09d 0000000000000000
0x00007f4bd2647d20: 0000000000000000 00000000fc6b3f20
0x00007f4bd2647d30: 00007f4bd2647d30 00007f4c15867f52
0x00007f4bd2647d40: 00007f4bd2647d88 00007f4c14fb6bb0
0x00007f4bd2647d50: 0000000000000000 00007f4c14fef028
0x00007f4bd2647d60: 00007f4bd2647d28 00007f4bd2647d88
0x00007f4bd2647d70: 00007f4bd2647dd0 00007f4c2728b09d
0x00007f4bd2647d80: 0000000000000000 00000000fc699b60
0x00007f4bd2647d90: 00007f4bd2647d90 00007f4c153df16c
0x00007f4bd2647da0: 00007f4bd2647de8 00007f4c153df5d0
0x00007f4bd2647db0: 0000000000000000 00007f4c153df1c8
0x00007f4bd2647dc0: 00007f4bd2647d88 00007f4bd2647de8
Instructions: (pc=0x00007f4c14be818c)
0x00007f4c14be816c: 00 00 00 00 48 8b 55 00 48 85 d2 75 13 eb 3d 0f
0x00007f4c14be817c: 1f 44 00 00 48 8b 92 c8 00 00 00 48 85 d2 74 34
0x00007f4c14be818c: 39 1a 75 f0 c7 82 98 00 00 00 01 00 00 00 48 8d
0x00007f4c14be819c: 3d 87 38 00 00 c7 82 68 01 00 00 01 00 00 00 5b
Register to memory mapping:
RAX=0x00000000000000c3 is an unknown value
RBX=0x00000000000000c3 is an unknown value
RCX=0x0000000000000000 is an unknown value
RDX=0x0000000d00000000 is an unknown value
RSP=0x00007f4bd2647bd0 is pointing into the stack for thread: 0x00007f4bf063b800
RBP=0x00007f4c14dee380: master_index+0 in /var/lib/tomcat8/temp/libNRJavaSerial_tomcat8_0/libNRJavaSerial.so at 0x00007f4c14be0000
RSI=0x00007f4c14beaad6: <offset 0xaad6> in /var/lib/tomcat8/temp/libNRJavaSerial_tomcat8_0/libNRJavaSerial.so at 0x00007f4c14be0000
RDI=0x00007f4c14beaad6: <offset 0xaad6> in /var/lib/tomcat8/temp/libNRJavaSerial_tomcat8_0/libNRJavaSerial.so at 0x00007f4c14be0000
R8 =0x00007f4c3529d6e0: <offset 0x10086e0> in /usr/lib/jvm/jre1.8/jre/lib/amd64/server/libjvm.so at 0x00007f4c34295000
R9 =0x000000000000000b is an unknown value
R10=0x00007f4c14ffb2bc is pointing into metadata
R11=0x00007f4c34cb3b80: <offset 0xa1eb80> in /usr/lib/jvm/jre1.8/jre/lib/amd64/server/libjvm.so at 0x00007f4c34295000
R12=0x00007f4c14beaf23: <offset 0xaf23> in /var/lib/tomcat8/temp/libNRJavaSerial_tomcat8_0/libNRJavaSerial.so at 0x00007f4c14be0000
R13={method} {0x00007f4c14ffdc80} 'interruptEventLoop' '()V' in 'gnu/io/RXTXPort'
R14=0x00007f4bd2647c60 is pointing into the stack for thread: 0x00007f4bf063b800
R15=0x00007f4bf063b800 is a thread
Stack: [0x00007f4bd2611000,0x00007f4bd264a000], sp=0x00007f4bd2647bd0, free space=218k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C [libNRJavaSerial.so+0x818c] Java_gnu_io_RXTXPort_interruptEventLoop+0x4c
Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j gnu.io.RXTXPort.interruptEventLoop()V+0
j gnu.io.RXTXPort.removeEventListener()V+62
j gnu.io.RXTXPort.close()V+99
j gnu.io.NRSerialPort.disconnect()V+18
j ***redacted***
j ***redacted***
j ***redacted***
v ~StubRoutines::call_stub
J 1834 sun.reflect.NativeMethodAccessorImpl.invoke0(Ljava/lang/reflect/Method;Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object; (0 bytes) @ 0x00007f4c2795f2f7 [0x00007f4c2795f280+0x77]
j sun.reflect.NativeMethodAccessorImpl.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object;+100
J 11853 C2 java.lang.reflect.Method.invoke(Ljava/lang/Object;[Ljava/lang/Object;)Ljava/lang/Object; (62 bytes) @ 0x00007f4c2955c1d4 [0x00007f4c2955c120+0xb4]
j org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor$LifecycleElement.invoke(Ljava/lang/Object;)V+16
j org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor$LifecycleMetadata.invokeDestroyMethods(Ljava/lang/Object;Ljava/lang/String;)V+122
j org.springframework.beans.factory.annotation.InitDestroyAnnotationBeanPostProcessor.postProcessBeforeDestruction(Ljava/lang/Object;Ljava/lang/String;)V+12
j org.springframework.beans.factory.support.DisposableBeanAdapter.destroy()V+48
j org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroyBean(Ljava/lang/String;Lorg/springframework/beans/factory/DisposableBean;)V+133
j org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroySingleton(Ljava/lang/String;)V+41
j org.springframework.beans.factory.support.DefaultListableBeanFactory.destroySingleton(Ljava/lang/String;)V+2
j org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroySingletons()V+105
j org.springframework.beans.factory.support.DefaultListableBeanFactory.destroySingletons()V+1
j org.springframework.context.support.AbstractApplicationContext.destroyBeans()V+4
j org.springframework.context.support.AbstractApplicationContext.doClose()V+133
j org.springframework.context.support.AbstractApplicationContext$1.run()V+14
v ~StubRoutines::call_stub
--------------- P R O C E S S ---------------
Java Threads: ( => current thread )
=>0x00007f4bf063b800 JavaThread "SpringContextShutdownHook" daemon [_thread_in_native, id=621408, stack(0x00007f4bd2611000,0x00007f4bd264a000)]
0x00007f4c04026800 JavaThread "SIGTERM handler" daemon [_thread_blocked, id=621404, stack(0x00007f4bd445b000,0x00007f4bd4494000)]
0x00007f4c18294800 JavaThread "RXTXPortMonitor(/dev/ttyS0)" daemon [_thread_in_native, id=128537, stack(0x00007f4bd44cd000,0x00007f4bd4506000)]
[241 more Java threads omitted]
[details omitted]
--------------- S Y S T E M ---------------
OS:PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
NAME="Debian GNU/Linux"
VERSION_ID="11"
VERSION="11 (bullseye)"
VERSION_CODENAME=bullseye
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
uname:Linux 5.10.0-27-amd64 #1 SMP Debian 5.10.205-2 (2023-12-31) x86_64
libc:glibc 2.31 NPTL 2.31
rlimit: STACK 8192k, CORE 0k, NPROC 6141, NOFILE 524288, AS infinity
load average:0,42 0,54 0,45
[details omitted]
Memory: 4k page, physical 1607956k(77304k free), swap 2440188k(1500068k free)
vm_info: OpenJDK 64-Bit Server VM (25.342-b07) for linux-amd64 JRE (1.8.0_342-b07), built on Jul 27 2022 21:29:28 by "jenkins" with gcc 7.3.1 20180303 (Red Hat 7.3.1-5)
time: Thu Feb 8 09:24:33 2024
timezone: CET
elapsed time: 57181.688148 seconds (0d 15h 53m 1s)
Observations
Note that the address being dereferenced doesn’t even remotely resemble a memory location nor a near-zero value, but instead looks more like unrelated, non-address data:
In other words, while navigating the chain of event_info_structs, the code in interruptEventLoop appears to be bumping into a garbage next pointer and tries to dereference it, causing the crash.
Possibly related to issue #197, which is reporting an infinite loop rather than a crash.
Possible explanation
If I’m reading the source code correctly, each event_info_struct is owned by its monitor thread, and lives in the stack frame of that thread’s eventLoop.
I couldn’t ultimately figure out how garbage could be possibly ending up in the middle of an event_info_struct.
But a possible explanation would be: a monitor thread might be dying before it gets to bail out of its event_loop. Or maybe a TERM or some other signal is beaming the thread out of its event_loop. Either way, the event_loop never gets to run finalize_event_info_struct. This causes a stale pointer to remain in the global chain, even after the stack frame of event_loop is unrolled. The memory might later be reused by another stack frame or stack.
This theory could explain why during a subsequent invocation of interruptEventLoop in some other thread, while navigating the global chain of event_info_structs, the code might be taking some piece of stack garbage as the real thing.
All that is just speculation from static analysis of the code. But if my theory happens to be correct, it might also explain the stalling reported in issue #197; if the stack segment of a reaped monitor thread is immediately reused by another monitor thread, the stack frame address of the new event_loop might perfectly align with the stale stack frame from the past thread’s event_loop, which means its event_info_struct pointer might end up twice in the global chain. This would cause a cycle in the chain, and a subsequent call to interruptEventLoop from a Java thread might become stuck in an infinite cycle.
Fix
Preliminary testing suggests that PR #211 might be fixing this.
The text was updated successfully, but these errors were encountered:
Summary
While shutting down the JVM in a heavily multithreaded context, we’ve been encountering crashes in
interruptEventLoop
:Details
Observations
Note that the address being dereferenced doesn’t even remotely resemble a memory location nor a near-zero value, but instead looks more like unrelated, non-address data:
The instruction pointer at the time of the crash:
points to this piece of code (obtained via
objdump -d
):Details
Which in turn matches the
index->fd != fd
expression here:nrjavaserial/src/main/c/src/SerialImp.c
Lines 4935 to 4936 in 0df8b60
In other words, while navigating the chain of
event_info_struct
s, the code ininterruptEventLoop
appears to be bumping into a garbagenext
pointer and tries to dereference it, causing the crash.Possibly related to issue #197, which is reporting an infinite loop rather than a crash.
Possible explanation
If I’m reading the source code correctly, each
event_info_struct
is owned by its monitor thread, and lives in the stack frame of that thread’seventLoop
.I couldn’t ultimately figure out how garbage could be possibly ending up in the middle of an
event_info_struct
.But a possible explanation would be: a monitor thread might be dying before it gets to bail out of its
event_loop
. Or maybe aTERM
or some other signal is beaming the thread out of itsevent_loop
. Either way, theevent_loop
never gets to runfinalize_event_info_struct
. This causes a stale pointer to remain in the global chain, even after the stack frame ofevent_loop
is unrolled. The memory might later be reused by another stack frame or stack.This theory could explain why during a subsequent invocation of
interruptEventLoop
in some other thread, while navigating the global chain ofevent_info_struct
s, the code might be taking some piece of stack garbage as the real thing.All that is just speculation from static analysis of the code. But if my theory happens to be correct, it might also explain the stalling reported in issue #197; if the stack segment of a reaped monitor thread is immediately reused by another monitor thread, the stack frame address of the new
event_loop
might perfectly align with the stale stack frame from the past thread’sevent_loop
, which means itsevent_info_struct
pointer might end up twice in the global chain. This would cause a cycle in the chain, and a subsequent call tointerruptEventLoop
from a Java thread might become stuck in an infinite cycle.Fix
Preliminary testing suggests that PR #211 might be fixing this.
The text was updated successfully, but these errors were encountered: