Describe the bug
We were debugging a weird issue where multiple HelixTaskExecutor-message_handle_thread threads were all contending for the lock on _controllerOpt.
Code-wise, createNewStateModel() creates a new DistClusterControllerStateModel instance for each (resourceName, partitionKey) pair.
Initially we suspected that all of these threads were handling (possibly stale) state transitions for the same (resourceName, partitionKey) pair, but that was ruled out once we confirmed they were processing state transitions for different clusters, i.e. different resources of the SuperCluster/CONTROLLER_CLUSTER.
Example log lines from threads 0, 35 and 37:
2025/06/25 09:37:13.340 INFO [HelixStateTransitionHandler] [HelixTaskExecutor-message_handle_thread_0] [helix] [] handling message: 9591d4c7-b08a-466e-bf21-2cfa95d94896 transit gobblin-ddm-kfketl2-test-ltx1-holdem-test-mho.gobblin-ddm-kfketl2-test-ltx1-holdem-test-mho|[] from:OFFLINE to:DROPPED, relayedFrom: null
2025/06/25 09:37:13.342 INFO [HelixStateTransitionHandler] [HelixTaskExecutor-message_handle_thread_0] [helix] [] Instance ltx1-app0932.xyz.com_12923, partition gobblin-ddm-kfketl2-test-ltx1-holdem-test-mho received state transition from OFFLINE to DROPPED on session 1100800c49e63bb4, message id: 9591d4c7-b08a-466e-bf21-2cfa95d94896
2025/06/25 09:36:51.119 INFO [HelixStateTransitionHandler] [HelixTaskExecutor-message_handle_thread_35] [helix] [] handling message: 0fae1c0d-129a-47c2-a8fd-2c2d00f0d7ca transit gobblin-kafka-streaming-service-ltx1-holdem-test-ppc.gobblin-kafka-streaming-service-ltx1-holdem-test-ppc|[] from:STANDBY to:LEADER, relayedFrom: null
2025/06/25 09:36:51.120 INFO [HelixStateTransitionHandler] [HelixTaskExecutor-message_handle_thread_35] [helix] [] Instance ltx1-app0932.xyz.com_12923, partition gobblin-kafka-streaming-service-ltx1-holdem-test-ppc received state transition from STANDBY to LEADER on session 1100800c49e63bb4, message id: 0fae1c0d-129a-47c2-a8fd-2c2d00f0d7ca
2025/06/25 09:36:51.120 INFO [DistClusterControllerStateModel] [HelixTaskExecutor-message_handle_thread_35] [helix] [] ltx1-app0932.xyz.com_12923 becoming leader from standby for gobblin-kafka-streaming-service-ltx1-holdem-test-ppc
2025/06/25 09:36:51.408 INFO [HelixStateTransitionHandler] [HelixTaskExecutor-message_handle_thread_37] [helix] [] handling message: 03c3a0e9-40e9-4169-8ec5-4c25d3c0f290 transit gobblin-kafka-streaming-tracking-ltx1-holdem-medvol-localConsumption.gobblin-kafka-streaming-tracking-ltx1-holdem-medvol-localConsumption|[] from:STANDBY to:LEADER, relayedFrom: null
2025/06/25 09:36:51.409 INFO [HelixStateTransitionHandler] [HelixTaskExecutor-message_handle_thread_37] [helix] [] Instance ltx1-app0932.xyz.com_12923, partition gobblin-kafka-streaming-tracking-ltx1-holdem-medvol-localConsumption received state transition from STANDBY to LEADER on session 1100800c49e63bb4, message id: 03c3a0e9-40e9-4169-8ec5-4c25d3c0f290
2025/06/25 09:36:51.409 INFO [DistClusterControllerStateModel] [HelixTaskExecutor-message_handle_thread_37] [helix] [] ltx1-app0932.xyz.com_12923 becoming leader from standby for gobblin-kafka-streaming-tracking-ltx1-holdem-medvol-localConsumption
It turns out there is a bug in DistClusterControllerStateModel: it initializes _controllerOpt with Optional.empty().
Even though each DistClusterControllerStateModel is a separate object, Optional.empty() returns a cached singleton, so the _controllerOpt reference is the same across all of these objects (see the Javadoc for Optional.empty()).
We simulated this through a test and verified the behavior.
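A minimal, standalone sketch of that check (using a hypothetical stub class in place of the real state model, since _controllerOpt is a private field; the field name and type here are only illustrative):

```java
import java.util.Optional;

public class OptionalEmptySingletonDemo {
    // Stand-in for DistClusterControllerStateModel; the real field holds the
    // controller's manager, this stub only mirrors the initialization pattern.
    static class StateModelStub {
        final Optional<Object> _controllerOpt = Optional.empty();
    }

    public static void main(String[] args) {
        StateModelStub a = new StateModelStub();
        StateModelStub b = new StateModelStub();

        // Optional.empty() returns a cached singleton, so two "independent"
        // state model instances end up holding the exact same object.
        System.out.println(a._controllerOpt == b._controllerOpt); // prints: true

        // Consequently, synchronized (_controllerOpt) in one instance blocks
        // synchronized (_controllerOpt) in every other instance.
    }
}
```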
To Reproduce
Create two DistClusterControllerStateModel objects and compare their _controllerOpt fields: the reference comparison returns true.
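To see the knock-on effect on the message handling threads, here is a hedged illustration (again with a stub class, not the actual Helix code) of two supposedly independent state models serializing on the shared empty Optional:

```java
import java.util.Optional;
import java.util.concurrent.TimeUnit;

public class SharedMonitorContentionDemo {
    static class StateModelStub {
        final Optional<Object> _controllerOpt = Optional.empty();

        void becomeLeaderFromStandby(String threadName) throws InterruptedException {
            synchronized (_controllerOpt) {
                System.out.println(threadName + " acquired the lock");
                TimeUnit.SECONDS.sleep(2); // simulate a slow controller start
                System.out.println(threadName + " released the lock");
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Two different instances, as created per (resourceName, partitionKey) pair.
        StateModelStub clusterA = new StateModelStub();
        StateModelStub clusterB = new StateModelStub();

        Thread t1 = new Thread(() -> {
            try { clusterA.becomeLeaderFromStandby("thread-A"); } catch (InterruptedException ignored) { }
        });
        Thread t2 = new Thread(() -> {
            try { clusterB.becomeLeaderFromStandby("thread-B"); } catch (InterruptedException ignored) { }
        });

        t1.start();
        t2.start();
        t1.join();
        t2.join();
        // thread-B only enters the synchronized block after thread-A releases it,
        // even though the two state models are for different clusters.
    }
}
```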
Expected behavior
The lock should be scoped per (resource, partition) pair, not shared across all of them.
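One possible direction, sketched below under the assumption that the synchronization is on the _controllerOpt field itself: keep the Optional for the controller reference, but guard it with a dedicated per-instance lock object (a sketch only, not the actual Helix patch):

```java
import java.util.Optional;

public class PerInstanceLockSketch {
    static class StateModelStub {
        // Dedicated monitor, unique to this instance.
        private final Object _controllerLock = new Object();
        private Optional<Object> _controllerOpt = Optional.empty();

        void becomeLeaderFromStandby(Object controller) {
            synchronized (_controllerLock) {
                _controllerOpt = Optional.of(controller);
            }
        }
    }

    public static void main(String[] args) {
        StateModelStub a = new StateModelStub();
        StateModelStub b = new StateModelStub();

        // Each instance now owns its own monitor, so state transitions for
        // different (resource, partition) pairs no longer contend on one lock.
        System.out.println(a._controllerLock == b._controllerLock); // prints: false
    }
}
```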
Additional context
Thread dump: