Skip to content

11622 : OutlierDetection should use Ticker, not TimeProvider #12110

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

vimanikag
Copy link

@vimanikag vimanikag commented May 29, 2025

Description :

TimeProvider provides wall time. That can move forward and backward as time is adjusted. OutlierDetection is measuring durations, so it should use a monotonic clock. We use Ticker (either Guava's or Deadline's, but generally Guava's) for that. FakeClock supports both for testing.

Doing a quick audit, this seems to be the only present incorrect use of TimeProvider.

I feel like I noticed this before, but I don't know why I didn't change it at the time. Maybe I had wanted to change the weird Long usage as well. (E.g., EPOCH = currentTimeNanos() - 1 and use EPOCH instead of null.)

https://github.com/grpc/grpc-java/issues/11622

@vimanikag vimanikag marked this pull request as ready for review May 29, 2025 06:17
Copy link
Member

@ejona86 ejona86 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is insufficient. You need to audit all usages of the time and make sure subtraction is used when determining the times. See System.nanoTime's javadoc for explanation.

@ejona86 ejona86 added the kokoro:run Add this label to a PR to tell Kokoro the code is safe and tests can be run label May 29, 2025
@grpc-kokoro grpc-kokoro removed the kokoro:run Add this label to a PR to tell Kokoro the code is safe and tests can be run label May 29, 2025
@vimanikag
Copy link
Author

This is insufficient. You need to audit all usages of the time and make sure subtraction is used when determining the times. See System.nanoTime's javadoc for explanation.

I have fallowed the System.nanoTime's javadoc for the subtraction cases and avoided the direct usage of ticker.read() finding the time difference.

During my audit , I have noticed below classes also using ticker.read() directly in finding the time difference and fallowed the same suggestions for below two classes :

https://github.com/grpc/grpc-java/blob/master/binder/src/main/java/io/grpc/binder/internal/PingTracker.java
https://github.com/grpc/grpc-java/blob/master/netty/src/main/java/io/grpc/netty/AbstractNettyHandler.java

@ejona86
Copy link
Member

ejona86 commented Jun 3, 2025

During my audit , I have noticed below classes also using ticker.read() directly in finding the time difference and fallowed the same suggestions for below two classes :

I don't understand. All you did is make a variable for the ticker.read() call. It was 100% fine before your change, and you didn't change any behavior. Revert those changes.

I'm confident OutlierDetectionLoadBalancer is still broken.

@vimanikag
Copy link
Author

vimanikag commented Jun 4, 2025

During my audit , I have noticed below classes also using ticker.read() directly in finding the time difference and fallowed the same suggestions for below two classes :

I don't understand. All you did is make a variable for the ticker.read() call. It was 100% fine before your change, and you didn't change any behavior. Revert those changes.

I'm confident OutlierDetectionLoadBalancer is still broken.

I am hoping the changes below will fix some edge cases like handling inconsistent time with ticker.read() and avoid -ve elapsedTimeNanos , please let me know Your thoughts on above comments "I'm confident OutlierDetectionLoadBalancer is still broken."

long elapsedTimeNanos = Math.max(0L,currentTickerTimeNanos - detectionTimerStartNanos);

I have added the similar changes in AbstractNettyHandler as elapsed calculation has involved and reverted the PingTracker changes as we are not using a ticker to find elapsed time

I am really not getting Your thoughts on ticker.read() usage and concerns , as per my understanding ticker.read() is used to get System.nanoTime() to find the time difference between the current action using monotonic clock time instead of using wall time.

it will really help to take it forward in this case if You share some more thoughts on "why You feel the OutlierDetectionLoadBalancer is still broken" after using ticker.read().

// If a timer has started earlier we cancel it and use the difference between the start
// time and now as the interval.
initialDelayNanos = Math.max(0L,
config.intervalNanos - (timeProvider.currentTimeNanos() - detectionTimerStartNanos));
config.intervalNanos - elapsedTimeNanos);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can keep everything in-line here and above to make it less confusing. Storing in a variable doesn't change anything here, but sure you need to do Math.max(0L, ...) to not have any negative value there.

@@ -199,7 +199,8 @@ public void updateWindow() throws Http2Exception {
pingReturn++;
setPinging(false);

long elapsedTime = (ticker.read() - lastPingTime);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was fine I believe, this doesn't need any change.

@@ -201,7 +203,7 @@ class DetectionTimer implements Runnable {

@Override
public void run() {
detectionTimerStartNanos = timeProvider.currentTimeNanos();
detectionTimerStartNanos = ticker.read();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you need to audit fully, if somewhere in inner methods it is being used in unwanted way - like it is mentioned in javadoc. The comparison should always be done doing subtraction.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have taken the report using grep commands and verified with the IDE search with ticker.read() but I did not see the scenario where the ticker.read() is directly used in finding the difference without doing the subtraction and We are frequently using the this ticker.read() in junit's but not seen it's used in unwanted way

I can see the ticker.read() in a few below implementation classes but not seen it's used in an unwanted way , please find the attached Audit_TR.txt for Your reference.

Aduit_ticker_read.txt

AbstractNettyHandler.java
NettyServerHandler.java
CachingRlsLbClient.java
LinkedHashLruCache.java
AdaptiveThrottler.java
PingTracker.java
OutlierDetectionLoadBalancer.java

@shivaspeaks
Copy link
Member

Also, please add the description and link the issue it is trying to resolve.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants