Skip to content

Conversation

@jbertram
Copy link
Contributor

@jbertram jbertram commented Oct 30, 2025

This is the latest in a series of related PRs. This replaces #5774.

}
}

public static final boolean isIoUringAvailable() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added via netty/netty#15785 a way to verify the status of the different optimizations made by io_uring on the OS - you can use it for reporting/debugging 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was about to use it, but I see now that it's so recent that it's not in a release yet. 😆


if (useEpoll && CheckDependencies.isEpollAvailable()) {
if (useIoUring && CheckDependencies.isIoUringAvailable()) {
//IO_URING should default to 1 remotingThread unless specified in config
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should use less than we use too with epoll - but 1 could be quite low.

And check netty/netty#15524 if can help to scale up the number of event loops if required

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC, setting it to 1 was a recommendation I found back from when netty io_uring was still an incubator project. I did some testing on this when I submitted the previous PR and saw that using 1 thread performed best up to something on the order of a few thousand connections and messages per second. After that it had to be increased but as I recall it, I saw no measurable improvement using more than just a few dedicated threads. Again, this was some time ago but I would guess up to ~5 threads or so on a decently sized server, say 16 cores.

Perhaps it should be set to something like Runtime.getRuntime().availableProcessors() or half that number or similar? Otherwise I guess the scaling option would make sense...

@AntonRoskvist
Copy link
Contributor

Very nice @jbertram !

Coincidentally, I was just about to send a PR for just the Netty upgrade parts of this and was planning to add IO_URING support on top of that later (as in, I have part of the commit message typed out). I stopped shy of submitting it when I noticed the exact same failure I'm seeing here now, on:
org.apache.activemq.artemis.utils.ByteUtilTest#shouldZeroesLimitedDirectByteBuffer() using JDK-25... so this is apparently an issue when using Netty 4.2 with JDK-25 and not with any IO_URING changes... I haven't figured out what the issue is yet though.

@jbertram
Copy link
Contributor Author

@AntonRoskvist, yeah, I just noticed that as well. The full test-suite passes on this 100% but it was using JDK 17. Will need to do some investigation. Let me know if you make any determinations.

@jbertram
Copy link
Contributor Author

jbertram commented Oct 30, 2025

@franz1981, on Java 25 with Netty 4.2.7 we're seeing this:

Error:  org.apache.activemq.artemis.utils.ByteUtilTest.shouldZeroesLimitedDirectByteBuffer -- Time elapsed: 0.002 s <<< ERROR!
java.lang.IndexOutOfBoundsException
	at java.base/java.nio.Buffer$1.apply(Buffer.java:756)
	at java.base/java.nio.Buffer$1.apply(Buffer.java:753)
	at java.base/jdk.internal.util.Preconditions$4.apply(Preconditions.java:213)
	at java.base/jdk.internal.util.Preconditions$4.apply(Preconditions.java:210)
	at java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:98)
	at java.base/jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:106)
	at java.base/jdk.internal.util.Preconditions.checkIndex(Preconditions.java:302)
	at java.base/java.nio.Buffer.checkIndex(Buffer.java:767)
	at java.base/java.nio.DirectByteBuffer.put(DirectByteBuffer.java:373)
	at org.apache.activemq.artemis.utils.ByteUtil.uncheckedZeros(ByteUtil.java:515)
	at org.apache.activemq.artemis.utils.ByteUtil.zeros(ByteUtil.java:497)
	at org.apache.activemq.artemis.utils.ByteUtilTest.shouldZeroesByteBuffer(ByteUtilTest.java:176)
	at org.apache.activemq.artemis.utils.ByteUtilTest.shouldZeroesLimitedDirectByteBuffer(ByteUtilTest.java:234)

The org.apache.activemq.artemis.utils.ByteUtil.uncheckedZeros is something you wrote 7 years ago and at first glance it looks like the dark arts. Do you have any idea why this would be failing?

@franz1981
Copy link
Contributor

@jbertram

2 things:

  1. https://github.com/apache/activemq-artemis/blob/1c1924c18b4ae6b00edbac7d3b210438c99ff5da/artemis-commons/src/test/java/org/apache/activemq/artemis/utils/ByteUtilTest.java#L234 cannot work if there's no Unsafe available and direct ByteBuffer so you can add an assumeTrue(PlatformDependent.hasUnsafe() to the test
  2. If you go there it means that you are not starting artemis or the test with the "right" parameters to still use Unsafe in a JAVA 25 world...

The former worry me a bit because it is basically saying "I can zero a buffer which is limiting how much i can write into ti" and I don't remember why I've done it TBH .-.

The latter can be addressed by https://netty.io/wiki/java-24-and-sun.misc.unsafe.html#netty-42-and-unsafe i.e. preferred approach and configuration.
But you can still enable Netty to use Unsafe with --sun-misc-unsafe-memory-access=allow AFAIK: give it a shot!

@jbertram jbertram force-pushed the ARTEMIS-3163 branch 3 times, most recently from 4b608c8 to 874b754 Compare October 30, 2025 17:52
@jbertram
Copy link
Contributor Author

@franz1981, thanks! That got me moving in the right direction. 👍

@jbertram jbertram force-pushed the ARTEMIS-3163 branch 5 times, most recently from 92e643e to 42571ec Compare October 31, 2025 01:04
@jbertram jbertram marked this pull request as draft October 31, 2025 01:58
@franz1981
Copy link
Contributor

franz1981 commented Oct 31, 2025

This is a fyi : netty/netty#14432

In theory is possible to extend the existing Uring support to includes files too, which enable faster performance then libaio since it allows using async fdatasync.
In order to do it:

  • you can have a low latency single threaded netty event loop just handling disk I/O
  • Artemis can reuse the existing configured event loop groups

This can lead to few simplification including replacing the timed buffer with a batching mechanism running directly on the event loop - which accumulate them while waiting the last in flight fdatasync to complete - or the expected time credits to expire

@AntonRoskvist
Copy link
Contributor

AntonRoskvist commented Oct 31, 2025

@franz1981

2. If you go there it means that you are not starting artemis or the test with the "right" parameters to still use `Unsafe` in a JAVA 25 world...

I don't fully understand the implication of this... say a user starts up a version of the broker using Netty 4.2, JDK 25 and does not allow Unsafe access... would they see issues with this particular test or could they wind up hitting a java.lang.IndexOutOfBoundsException in a running brokers regular handling of ByteBuffers?

I'm guessing it's specific to the test otherwise there should be some explicit check for these conditions when starting up the broker...

@franz1981
Copy link
Contributor

IDK to be fair, that's why running a proper unsafe-forcibly disabled broker with NIO journal would reveal it

@AntonRoskvist
Copy link
Contributor

AntonRoskvist commented Oct 31, 2025

This affects more things it seems... I just built this locally, then ran:

$ artemis-distribution/target/apache-artemis-2.45.0-SNAPSHOT-bin/apache-artemis-2.45.0-SNAPSHOT/bin/artemis create /home/$USERNAME/broker

Creating ActiveMQ Artemis instance at: /home/$USERNAME/broker

.......

Auto tuning journal ...
done! Your system can make 250 writes per millisecond, your journal-buffer-timeout will be 4000
java.lang.UnsupportedOperationException: Cannot clean arbitrary ByteBuffer instances
at io.netty.util.internal.CleanerJava24Linker.freeDirectBuffer(CleanerJava24Linker.java:203)
at io.netty.util.internal.PlatformDependent.freeDirectBuffer(PlatformDependent.java:613)
at org.apache.activemq.artemis.core.io.nio.NIOSequentialFileFactory.releaseDirectBuffer(NIOSequentialFileFactory.java:138)
at org.apache.activemq.artemis.core.io.nio.NIOSequentialFileFactory.releaseBuffer(NIOSequentialFileFactory.java:161)
at org.apache.activemq.artemis.core.io.nio.NIOSequentialFile.fill(NIOSequentialFile.java:190)
at org.apache.activemq.artemis.cli.commands.util.SyncCalculation.syncTest(SyncCalculation.java:114)
at org.apache.activemq.artemis.cli.commands.Create.performAutoTune(Create.java:1113)
at org.apache.activemq.artemis.cli.commands.Create.run(Create.java:878)
at org.apache.activemq.artemis.cli.commands.Create.execute(Create.java:561)
at org.apache.activemq.artemis.cli.Artemis.internalExecute(Artemis.java:222)
at org.apache.activemq.artemis.cli.Artemis.execute(Artemis.java:170)
at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
at java.base/java.lang.reflect.Method.invoke(Method.java:565)
at org.apache.activemq.artemis.boot.Artemis.execute(Artemis.java:149)
at org.apache.activemq.artemis.boot.Artemis.main(Artemis.java:61)
Couldn't perform sync calculation, using default values

You can now start the broker by executing:

.......

Java version:

$ java -version
openjdk version "25" 2025-09-16 LTS
OpenJDK Runtime Environment Temurin-25+36 (build 25+36-LTS)
OpenJDK 64-Bit Server VM Temurin-25+36 (build 25+36-LTS, mixed mode, sharing)

@jbertram
Copy link
Contributor Author

@franz1981, reading the doc you cited it seemed at first to me that all the broker would need is --enable-native-access=ALL-UNNAMED, but apparently it needs --sun-misc-unsafe-memory-access=allow as well. Is that expected?

@jbertram
Copy link
Contributor Author

@AntonRoskvist, my last push contains some scripting to detect the JDK and add --sun-misc-unsafe-memory-access=allow when necessary.

mv $ARTEMIS_OOME_DUMP $ARTEMIS_OOME_DUMP.bkp
fi

JAVA_VERSION=`$JAVACMD -version 2>&1 | head -n 1 | awk -F '"' '{print $2}' | cut -d'.' -f1`
Copy link
Contributor

@AntonRoskvist AntonRoskvist Oct 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So probably this won't be final but this can be simplified a bit, doing something like:
$JAVACMD --sun-misc-unsafe-memory-access=allow --version >/dev/null && ALLOW_UNSAFE="--sun-misc-unsafe-memory-access=allow"

in essence, $JAVACMD --sun-misc-unsafe-memory-access=allow --version returns 0 if the version is > 24 and 1 otherwise, meaning if the option can be set, it will get set.

Could be set as part of an if statement for clarity as well...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this a lot better than what I was using before. The main benefit is that it doesn't rely on head, awk, or cut which might not be available (especially in a container).

@AntonRoskvist
Copy link
Contributor

@jbertram yes, looking good.

I think it's a strange behavior though, needing both flags. My understanding was that Netty sorts this out when starting up with the worst side effect being suboptimal performance in case it doesn't have access to Unsafe or other optimizations... I guess I feel that either Netty should catch this (as I'm guessing it has in previous versions), or some proper/robust check should be added to the broker during startup (from within the JVM).

@franz1981
Copy link
Contributor

I am working hard the last months to fix the performance of Netty without Unsafe. And is not finished yet 🙏
But it should be in decent state already..if not, please let me know and I will happily fix what needed

@jbertram
Copy link
Contributor Author

jbertram commented Oct 31, 2025

@franz1981, your efforts are much appreciated!

Could you confirm whether applications using Netty 4.2 with JDK 25 should be using both --enable-native-access=ALL-UNNAMED and --sun-misc-unsafe-memory-access=allow?

It's worth noting that Artemis will already need --enable-native-access=ALL-UNNAMED due to our native libaio integration.

@franz1981
Copy link
Contributor

Iouring and epoll are both using JNI - I suppose that is the reason why you need such.
I remember there are other flags which need to be used as well, on Monday will write them here

mv $ARTEMIS_OOME_DUMP $ARTEMIS_OOME_DUMP.bkp
fi

# Netty needs access to unsafe, but this is turned off in Java 25 by default
Copy link
Member

@gemmellr gemmellr Nov 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment could be confusing/misleading later, since Unsafe itself is not turned off in Java 25 by default. It warns once on first use on 24+ but whats left of Unsafe (some has been removed already) is still usable. Setting --sun-misc-unsafe-memory-access=allow facilitates stopping printing the warning on Java 24+. On future JDKs, e.g perhaps 26+, that will not be permitted and the Unsafe methods will be disabled by default and the allowed fallback will become --sun-misc-unsafe-memory-access=warn to again permit them but always with the warning. On yet further future JDKs the Unsafe memory methods will be removed and --sun-misc-unsafe-memory-access will be ignored. At a later point --sun-misc-unsafe-memory-access will be removed (which may mean its presence prevents startup).

Netty 4.2 doesn't use Unsafe by default in Java 24+ so as to avoid the warning (whereas 4.1 still does use Unsafe by default [excepting 4.1.120-4.1.121], hence the warning mentioned on ARTEMIS-5711 when it uses Unsafe currently). Instead it uses the 'fallback MemorySegment' based cleaner on 25+ (it's broken on 24, so that seems to default to heap-buffers rather than direct buffers) rather than use Unsafe. Its possible to have it use a more performant alternative on 24+ by explicitly enabling native access (which is actually still enabled by default on 24+, for now, but again produces a warning, so Netty 4.2 doesn't do this by default to avoid those warnings, requiring the --enable-native-access flag). Yet alternatively again, you can make it just use Unsafe on Java 24+ [until they prevent 'allow', perhaps JDK26+] by explicitly specifying it can via --sun-misc-unsafe-memory-access=allow, or using the Netty-specific io.netty.noUnsafe prop.

https://github.com/netty/netty/blob/netty-4.2.7.Final/common/src/main/java/io/netty/util/internal/PlatformDependent0.java#L547-L588 plus the monster starting at https://github.com/netty/netty/blob/4.2/common/src/main/java/io/netty/util/internal/PlatformDependent0.java#L85

fi

# Netty needs access to unsafe, but this is turned off in Java 25 by default
$JAVACMD --sun-misc-unsafe-memory-access=allow --version > /dev/null 2>&1 && ALLOW_UNSAFE="--sun-misc-unsafe-memory-access=allow"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If netty is the only thing we need/decide to enable Unsafe for (dont know) on 25+, then I wonder if going with io.netty.noUnsafe sys prop might be preferable, as it at least currently seems like it wont require a second JVM startup and there would be no harm always specifying it on either older JDKs or future JDKs after its irrelevant.

Though, that would leave the warning on 25..

I still wonder about just leaving it not using Unsafe on 25. Seems like that should be possible, and that is the future beyond 25. If thats still not viable, perhaps we stick with 4.1 using Unsafe and its warnings for a bit longer until it is, and just try to ensure compatibility if an end user wants to use 4.2 (the milestone/RC builds of spring using Artemis with 4.2 suggest it mostly already is...thats the only 4.2 usage I'm personally aware of). Anyone not satisifed with no-Unsafe can always add the config themselves.


if [ -z "$JAVA_ARGS" ]; then
JAVA_ARGS="-Dlog4j2.disableJmx=true --add-opens java.base/jdk.internal.misc=ALL-UNNAMED ${java-utility-opts}"
JAVA_ARGS="-Dlog4j2.disableJmx=true --add-opens java.base/jdk.internal.misc=ALL-UNNAMED --enable-native-access=ALL-UNNAMED ${java-utility-opts}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least its the case that --enable-native-access was already in JDK17 (as the initial FFM in 17 was an incubator addition there) so hopefully that means it works on all versions of JDK17+ as opposed to just post-GA backport releases, save us doing any checking.

Comment on lines +242 to +248
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<configuration>
<argLine>${activemq-surefire-argline}</argLine>
</configuration>
</plugin>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this not already the case from the parent?

Comment on lines +207 to +209
String javaVersion = System.getProperty("java.version");
if (javaVersion.startsWith("24") || javaVersion.startsWith("25")) {
commandList.add("--enable-native-access=ALL-UNNAMED");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps Runtime.version().feature(); would work here.

Restricting to just 24 and 25 for --enable-native-access probably doesnt make sense, its going to be useful or needed for all 26+


@Test
public void shouldZeroesLimitedDirectByteBuffer() {
assumeTrue(PlatformDependent.hasUnsafe());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is possibly covering up a bug in either the test or more likely in the ByteUtil that was exposed by Unsafe not being available, and so this doesnt seem like the solution.

I dont think this test should need Unsafe, the behaviour just seems questionable. The very next test calls the same utility method, without Unsafe. It only behaves differently as the buffer passed isnt limited. Real code could call the ByeUtil method later without Unsafe and get unexpected behaviour, it wont have the benefit of covering itself with an assume.

EDIT: after typing that, I reminded myself of the time that this essentially already happened and I fixed it in the calling code:
https://issues.apache.org/jira/browse/ARTEMIS-4340
https://lists.apache.org/thread/7scdf4859scywos4knw44dbdw39wphcb

If Franz doesnt know why he did it, and it doesnt work without Unsafe anyway, I think it might be time we stopped it doing what its doing, or figure out why it did and how to adapt it to not having Unsafe..

Comment on lines +352 to +357

@LogMessage(id = 214037, value = "Unable to check IoUring availability ", level = LogMessage.Level.WARN)
void unableToCheckIoUringAvailability(Throwable e);

@LogMessage(id = 214038, value = "IoUring is not available, please add to the classpath or configure useIoUring=false to remove this warning", level = LogMessage.Level.WARN)
void unableToCheckIoUringAvailabilitynoClass();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could these go in commons? Its annoying when methods like these log and suggest they are client issues, but are in fact being triggered by the broker while its acting as a server to clients. E.g the method right above these ones is especially confusing when logged by the broker.

Comment on lines +57 to +66
try {
return Env.isLinuxOs() && IoUring.isAvailable();
} catch (NoClassDefFoundError noClassDefFoundError) {
ActiveMQClientLogger.LOGGER.unableToCheckIoUringAvailabilitynoClass();
return false;
} catch (Throwable e) {
ActiveMQClientLogger.LOGGER.unableToCheckIoUringAvailability(e);
return false;
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also feels like it would make more sense in commons (I know, the previous checks are in here; same comment to them)

Comment on lines +92 to +100
<dependency>
<groupId>io.netty</groupId>
<artifactId>netty-transport-native-io_uring</artifactId>
<classifier>${netty-transport-native-io_uring-classifier}</classifier>
</dependency>
<dependency>
<groupId>io.netty</groupId>
<artifactId>netty-transport-classes-io_uring</artifactId>
</dependency>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I've commented on all the other io_uring PRs, I'd personally prefer folks had to add the deps if they want them, especially given its off by default, and its entirely untested.

Or at least for the client, if not both. The way its availability+use is done would need to change slightly to facilitate it to work without the classes dep though (elsewhere, I use a class per native type, and thats the only one to directly reference the types).

@gemmellr
Copy link
Member

gemmellr commented Nov 5, 2025

Beyond those comments, I was surprised I didnt see any test changes around the change in Netty default for hostname verification in 4.2. Either the Artemis code already handles [enabling and] disabling hostname in the required manner in advance to still work properly with the updated behaviour, which I dont recall seeing it do so, or perhaps it might suggest there is a missing test around explicitly disabling it.

Comment on lines +747 to +770
<profile>
<id>jdk-17-23</id>
<activation>
<jdk>[17, 23]</jdk>
</activation>
<properties>
<activemq-surefire-argline>
${initial-activemq-surefire-argline}
</activemq-surefire-argline>
</properties>
</profile>
<profile>
<id>jdk-25</id>
<activation>
<jdk>[24, 25]</jdk>
</activation>
<properties>
<activemq-surefire-argline>
${initial-activemq-surefire-argline}
--enable-native-access=ALL-UNNAMED
--sun-misc-unsafe-memory-access=allow
</activemq-surefire-argline>
</properties>
</profile>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The jdk25 profile is also set for 24 so should probably mention that in the name.

It may be better to say [24, 26) rather than [24, 25], at least that was always the historic recommendation, but its possible it no longer matters thee days: https://maven.apache.org/guides/introduction/introduction-to-profiles.html#JDK

Nitpick, but the several existing jdk related profiles skip the "-" immediately after "jdk", so e.g "jdk24-25" would align better with them.

https://github.com/apache/activemq-artemis/blob/2.34.0/tests/integration-tests/pom.xml#L476-L484

That also shows the format used elsewhere by having a specific prop for the additions, that is set empty to begin with then referenced in the original base property, and then set to the alternative value in a profile when needed meaning you only need the higher-version profile(s) for the additions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants