Skip to content

Conversation

@scottmf
Copy link

@scottmf scottmf commented Nov 20, 2025

Added extra cleanup of PagingManagerImpl.stores and PostOfficeImpl.duplicateIDCache kicked off from PostOfficeImpl.reapAddresses()

@jbertram
Copy link
Contributor

I appreciate that the modifications to the reaper resolve the issue, but this is kind of like bailing water out of a boat with a hole in it. I would much rather just plug the hole. In other words, we need to determine where exactly these resources are leaking and fix that.

If you can work up a real reproducer (i.e. not something using mocks), even if it uses standalone brokers, I can take it from there.

It's worth noting that duplicate detection is on by default for cluster-connections and lots of folks are using a cluster. If this was simple to reproduce I imagine this problem would have been reported long ago so my guess is that there's something fairly unique about your use-case.

@jbertram
Copy link
Contributor

In any event, you've got a handful of CheckStyle issues:

Error:  src/main/java/org/apache/activemq/artemis/core/postoffice/impl/PostOfficeImpl.java:[1591,17] (indentation) Indentation: 'try' child has incorrect indentation level 16, expected level should be 15.
Error:  src/main/java/org/apache/activemq/artemis/core/postoffice/impl/PostOfficeImpl.java:[1592,17] (indentation) Indentation: 'try' child has incorrect indentation level 16, expected level should be 15.
Error:  src/main/java/org/apache/activemq/artemis/core/postoffice/impl/PostOfficeImpl.java:[1593,17] (indentation) Indentation: 'try' child has incorrect indentation level 16, expected level should be 15.
Error:  src/main/java/org/apache/activemq/artemis/core/postoffice/impl/PostOfficeImpl.java:[1595,17] (indentation) Indentation: 'catch' child has incorrect indentation level 16, expected level should be 15.
Error:  src/test/java/org/apache/activemq/artemis/core/paging/impl/PagingStoreCleanupTest.java:[32,47] (imports) AvoidStarImport: Using the '.*' form of import should be avoided - org.junit.jupiter.api.Assertions.*.
Error:  src/test/java/org/apache/activemq/artemis/core/paging/impl/PagingStoreCleanupTest.java:[33,34] (imports) AvoidStarImport: Using the '.*' form of import should be avoided - org.mockito.Mockito.*.
Error:  src/test/java/org/apache/activemq/artemis/core/postoffice/impl/DuplicateIDCacheCleanupTest.java:[43,47] (imports) AvoidStarImport: Using the '.*' form of import should be avoided - org.junit.jupiter.api.Assertions.*.
Error:  src/test/java/org/apache/activemq/artemis/core/postoffice/impl/DuplicateIDCacheCleanupTest.java:[44,34] (imports) AvoidStarImport: Using the '.*' form of import should be avoided - org.mockito.Mockito.*.

For what it's worth, you can run this check locally using mvn checkstyle:check or the dev profile.

@scottmf
Copy link
Author

scottmf commented Nov 21, 2025

I appreciate that the modifications to the reaper resolve the issue, but this is kind of like bailing water out of a boat with a hole in it. I would much rather just plug the hole. In other words, we need to determine where exactly these resources are leaking and fix that.

If you can work up a real reproducer (i.e. not something using mocks), even if it uses standalone brokers, I can take it from there.

It's worth noting that duplicate detection is on by default for cluster-connections and lots of folks are using a cluster. If this was simple to reproduce I imagine this problem would have been reported long ago so my guess is that there's something fairly unique about your use-case.

Yes, understood.

Over the past day i've been digging into this actually from another angle. I see that none of the addresses in those caches are auto-deleted as they should be and that is a bit baffling. I am cleaning the BRIDGE caches, but these caches seem like they should also be cleaned up.

While looking into this, i noticed we have a durable queue that is subscribed to the same address that all of these other queues are using - publish/#. I think this could be the core of my problem and the timing makes sense as to why we never encountered this until about 6 months ago as this is just about the time it was added. The queue is added from an external service so it wasn't clear until now.

Could you comment on that?

@scottmf
Copy link
Author

scottmf commented Nov 22, 2025

Just to be uber clear, these are a sample of the queues. The issue is that the temporary queues aren't being cleaned up, and neither are their bridge counterparts. Could it be that fundamentally I've just his an edge case where amq should not clean these up, or is this the bug? (or am i off-track?)

(This is from jmx)
non-durable / temporary queues:

        {
            "messageCount": "0",
            "messagesExpired": "0",
            "routingType": "MULTICAST",
            "address": "publish/#",
            "maxConsumers": "-1",
            "groupRebalance": "false",
            "consumerCount": "1",
            "temporary": "true",
            "groupBuckets": "-1",
            "enabled": "true",
            "autoDelete": "false",
            "scheduledCount": "0",
            "persistedPause": "false",
            "messagesAdded": "9529",
            "internalQueue": "false",
            "delayBeforeDispatch": "-1",
            "groupRebalancePauseDispatch": "false",
            "directDeliver": "false",
            "name": "67e6f22a-273c-44dd-ae82-7ba2a92c2a64",
            "durable": "false",
            "autoCreated": "false",
            "purgeOnNoConsumers": "false",
            "deliveringCount": "0",
            "ringSize": "-1",
            "messagesAcked": "9529",
            "exclusive": "false",
            "messagesKilled": "0",
            "paused": "false",
            "lastValueKey": "",
            "id": "60129556963",
            "filter": "",
            "consumersBeforeDispatch": "0",
            "groupFirstKey": "",
            "user": "brokerSystemUser",
            "lastValue": "false"
        },
        {
            "messageCount": "0",
            "messagesExpired": "0",
            "routingType": "MULTICAST",
            "address": "publish/#",
            "maxConsumers": "-1",
            "groupRebalance": "false",
            "consumerCount": "1",
            "temporary": "true",
            "groupBuckets": "-1",
            "enabled": "true",
            "autoDelete": "false",
            "scheduledCount": "0",
            "persistedPause": "false",
            "messagesAdded": "9529",
            "internalQueue": "false",
            "delayBeforeDispatch": "-1",
            "groupRebalancePauseDispatch": "false",
            "directDeliver": "false",
            "name": "d3c7a0a1-cac3-474f-893b-8d6530571ec7",
            "durable": "false",
            "autoCreated": "false",
            "purgeOnNoConsumers": "false",
            "deliveringCount": "0",
            "ringSize": "-1",
            "messagesAcked": "9529",
            "exclusive": "false",
            "messagesKilled": "0",
            "paused": "false",
            "lastValueKey": "",
            "id": "60129556972",
            "filter": "",
            "consumersBeforeDispatch": "0",
            "groupFirstKey": "",
            "user": "brokerSystemUser",
            "lastValue": "false"
        },

Here is the durable queue (temporary: false)

        {
            "messageCount": "0",
            "messagesExpired": "0",
            "routingType": "MULTICAST",
            "address": "publish/#",
            "maxConsumers": "-1",
            "groupRebalance": "false",
            "consumerCount": "2",
            "temporary": "false",
            "groupBuckets": "-1",
            "enabled": "true",
            "autoDelete": "false",
            "scheduledCount": "0",
            "persistedPause": "false",
            "messagesAdded": "4910",
            "internalQueue": "false",
            "delayBeforeDispatch": "-1",
            "groupRebalancePauseDispatch": "false",
            "directDeliver": "false",
            "name": "ebs-group.publish/#",
            "durable": "true",
            "autoCreated": "false",
            "purgeOnNoConsumers": "false",
            "deliveringCount": "0",
            "ringSize": "-1",
            "messagesAcked": "4910",
            "exclusive": "false",
            "messagesKilled": "0",
            "paused": "false",
            "lastValueKey": "",
            "id": "60129569335",
            "filter": "NOT ((AMQAddress = 'activemq.management') OR (AMQAddress = 'activemq.notifications'))",
            "consumersBeforeDispatch": "0",
            "groupFirstKey": "",
            "user": "user",
            "lastValue": "false"
        },

@tabish121
Copy link
Contributor

I would suggest making this a draft and working through the issue some more until it can be determined how to best address the underlying problem.

@scottmf
Copy link
Author

scottmf commented Nov 26, 2025

I would suggest making this a draft and working through the issue some more until it can be determined how to best address the underlying problem.

Sounds good. I am actively working on reproducing this in a simplified form factor. I will make this a DRAFT until I have something more concrete in place to show the issue.

@scottmf scottmf marked this pull request as draft November 26, 2025 18:12
@scottmf
Copy link
Author

scottmf commented Dec 2, 2025

@jbertram I just pushed an additional update that changes SimpleAddressManager.addressWasUsed(). It fixes the second issue I referenced in the repro. The repro appears to be completely fixed when using this change. All unit tests pass in the build, but it can definitely use some scrutiny.

In poking around the changes in this area, the issue that I'm seeing seems very similar to ARTEMIS-4162..

@scottmf scottmf marked this pull request as ready for review December 2, 2025 06:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants