Druid versions 28.0.0, 28.0.1
Operator version 1.2.1
While deploying an MM-less (MiddleManager-less) cluster based on the example, I ran into a problem: some tasks intermittently end with a FAILED status, even though the task logs show that execution finished with status SUCCESS.
INFO [task-runner-0-priority-0] org.apache.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: { "id" : "index_kinesis_name_cdeb9f4759f96fd_npcokplb", "status" : "SUCCESS", "duration" : 1740006, "errorMsg" : null, "location" : { "host" : null, "port" : -1, "tlsPort" : -1 } }
2024-01-30T11:24:02,602 INFO [LookupExtractorFactoryContainerProvider-MainThread] org.apache.druid.query.lookup.LookupReferencesManager - Lookup Management loop exited. Lookup notices are not handled anymore.
2024-01-30T11:24:02,608 INFO [main] org.apache.druid.cli.CliPeon - Thread [Thread[Thread-49,5,main]] is non daemon.
2024-01-30T11:24:02,610 INFO [main] org.apache.druid.java.util.common.lifecycle.Lifecycle - Lifecycle [module] already stopped and stop was called. Silently skipping
Finished peon task
Cannot remove shutdown hook, already shutting down!
2024-01-30T11:24:02,616 INFO [Thread-49] org.apache.druid.query.lookup.LookupReferencesManager - Closed lookup [***].
2024-01-30T11:24:02,616 INFO [Thread-49] org.apache.druid.query.lookup.LookupReferencesManager - Closed lookup [***].
2024-01-30T11:24:02,618 INFO [Thread-49] org.apache.druid.query.lookup.LookupReferencesManager - Closed lookup [***].
2024-01-30T11:24:02,620 INFO [Thread-49] org.apache.druid.k8s.discovery.K8sDruidNodeDiscoveryProvider - stopping
2024-01-30T11:24:02,621 INFO [Thread-49] org.apache.druid.k8s.discovery.K8sDruidNodeDiscoveryProvider$NodeRoleWatcher - Stopping NodeRoleWatcher for [OVERLORD]...
2024-01-30T11:24:02,623 ERROR [org.apache.druid.k8s.discovery.K8sDruidNodeDiscoveryProvider$NodeRoleWatcheroverlord] org.apache.druid.k8s.discovery.K8sDruidNodeDiscoveryProvider$NodeRoleWatcher - Expection while watching for NodeRole [OVERLORD].
java.lang.RuntimeException: java.lang.InterruptedException
    at org.apache.druid.concurrent.LifecycleLock$Sync.awaitStarted(LifecycleLock.java:144) ~[druid-processing-28.0.1.jar:28.0.1]
    at org.apache.druid.concurrent.LifecycleLock.awaitStarted(LifecycleLock.java:245) ~[druid-processing-28.0.1.jar:28.0.1]
    at org.apache.druid.k8s.discovery.K8sDruidNodeDiscoveryProvider$NodeRoleWatcher.keepWatching(K8sDruidNodeDiscoveryProvider.java:257) ~[?:?]
    at org.apache.druid.k8s.discovery.K8sDruidNodeDiscoveryProvider$NodeRoleWatcher.watch(K8sDruidNodeDiscoveryProvider.java:237) ~[?:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) ~[?:?]
    at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
    at java.lang.Thread.run(Thread.java:840) ~[?:?]
Caused by: java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1081) ~[?:?]
    at org.apache.druid.concurrent.LifecycleLock$Sync.awaitStarted(LifecycleLock.java:139) ~[druid-processing-28.0.1.jar:28.0.1]
    ... 8 more
2024-01-30T11:24:12,625 INFO [org.apache.druid.k8s.discovery.K8sDruidNodeDiscoveryProvider$NodeRoleWatcheroverlord] org.apache.druid.k8s.discovery.K8sDruidNodeDiscoveryProvider$NodeRoleWatcher - Exited Watch for NodeRole [OVERLORD].
2024-01-30T11:24:12,625 INFO [Thread-49] org.apache.druid.k8s.discovery.K8sDruidNodeDiscoveryProvider$NodeRoleWatcher - Stopped NodeRoleWatcher for [OVERLORD].
2024-01-30T11:24:12,625 INFO [Thread-49] org.apache.druid.k8s.discovery.K8sDruidNodeDiscoveryProvider$NodeRoleWatcher - Stopping NodeRoleWatcher for [COORDINATOR]...
2024-01-30T11:24:14,750 ERROR [org.apache.druid.k8s.discovery.K8sDruidNodeDiscoveryProvider$NodeRoleWatchercoordinator] org.apache.druid.k8s.discovery.K8sDruidNodeDiscoveryProvider$NodeRoleWatcher - Expection while watching for NodeRole [COORDINATOR].
java.lang.RuntimeException: java.lang.InterruptedException
    at org.apache.druid.concurrent.LifecycleLock$Sync.awaitStarted(LifecycleLock.java:144) ~[druid-processing-28.0.1.jar:28.0.1]
    at org.apache.druid.concurrent.LifecycleLock.awaitStarted(LifecycleLock.java:245) ~[druid-processing-28.0.1.jar:28.0.1]
    at org.apache.druid.k8s.discovery.K8sDruidNodeDiscoveryProvider$NodeRoleWatcher.keepWatching(K8sDruidNodeDiscoveryProvider.java:257) ~[?:?]
    at org.apache.druid.k8s.discovery.K8sDruidNodeDiscoveryProvider$NodeRoleWatcher.watch(K8sDruidNodeDiscoveryProvider.java:237) ~[?:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) ~[?:?]
    at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
    at java.lang.Thread.run(Thread.java:840) ~[?:?]
Caused by: java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1081) ~[?:?]
    at org.apache.druid.concurrent.LifecycleLock$Sync.awaitStarted(LifecycleLock.java:139) ~[druid-processing-28.0.1.jar:28.0.1]
    ... 8 more
2024-01-30T11:24:24,751 INFO [org.apache.druid.k8s.discovery.K8sDruidNodeDiscoveryProvider$NodeRoleWatchercoordinator] org.apache.druid.k8s.discovery.K8sDruidNodeDiscoveryProvider$NodeRoleWatcher - Exited Watch for NodeRole [COORDINATOR].
2024-01-30T11:24:24,752 INFO [Thread-49] org.apache.druid.k8s.discovery.K8sDruidNodeDiscoveryProvider$NodeRoleWatcher - Stopped NodeRoleWatcher for [COORDINATOR].
2024-01-30T11:24:24,752 INFO [Thread-49] org.apache.druid.k8s.discovery.K8sDruidNodeDiscoveryProvider - stopped
2024-01-30T11:24:24,781 INFO [Thread-49] org.apache.druid.java.util.common.lifecycle.Lifecycle$CloseableHandler - Closing object[org.asynchttpclient.DefaultAsyncHttpClient@472de376]
2024-01-30T11:24:24,783 INFO [Thread-49] org.apache.druid.java.util.common.lifecycle.Lifecycle - Stopping lifecycle [module] stage [INIT]
The task reports show "No task reports were found for this task.", however report.json is present in the S3 bucket after the task completes.
Is there a known cause for this, and how can it be fixed?
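To check how many of the failed tasks actually logged SUCCESS, the status line can be pulled out of each peon log with a small script. This is only a sketch for triage: the function name `extract_task_status` is made up for illustration, and it assumes the log contains the `Task completed with status:` line followed by a JSON object, as in the log above.

```python
import json

# Marker text copied from the ExecutorLifecycle log line in this issue.
STATUS_MARKER = "Task completed with status:"


def extract_task_status(log_text):
    """Return the status dict logged by the peon, or None if it is absent.

    Hypothetical helper for triage; not part of Druid.
    """
    marker_pos = log_text.find(STATUS_MARKER)
    if marker_pos == -1:
        return None
    json_start = log_text.find("{", marker_pos)
    if json_start == -1:
        return None
    # raw_decode parses a single JSON value and ignores the log text after it.
    status, _ = json.JSONDecoder().raw_decode(log_text[json_start:])
    return status


if __name__ == "__main__":
    sample = (
        "INFO [task-runner-0-priority-0] "
        "org.apache.druid.indexing.worker.executor.ExecutorLifecycle - "
        'Task completed with status: { "id" : '
        '"index_kinesis_name_cdeb9f4759f96fd_npcokplb", '
        '"status" : "SUCCESS", "duration" : 1740006, "errorMsg" : null, '
        '"location" : { "host" : null, "port" : -1, "tlsPort" : -1 } } '
        "2024-01-30T11:24:02,602 INFO [main] more log output"
    )
    status = extract_task_status(sample)
    print(status["id"], status["status"])
```

Comparing this value with the status the Overlord reports for the same task id should show which tasks are affected by the mismatch.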
@AdheipSingh what is the reason for increasing capacity?
In the settings I have already specified druid.indexer.runner.capacity: 12.
I can share the Druid manifest.