Create throttled and retrying record iterator #3350


Merged: 22 commits into FoundationDB:main on May 27, 2025

Conversation

@ohadzeliger (Contributor) commented May 8, 2025:

This PR creates the ThrottledRetryingIterator for use in cases that need a reliable retrying mechanism with resource controls.
For the most part, this is new code that is not yet in use. Small changes to the existing code base (log message fields, an executor method made public, refactoring of FutureAutoClose) were made as well.
This class is meant to live at the same level as the other FDBDatabaseRunners, even though it does not implement that interface. Hence, it uses the TransactionalRunner and FutureAutoClose classes instead of the (already retrying) FDBDatabaseRunnerImpl.
The iterator implements resource controls (ops/sec and ops/transaction) for scan and delete operations. It does not implement similar controls for other operations (update, create, etc.); these can be added later when they are needed.

Resolves #3355
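The retry-with-throttling idea in this PR can be sketched in a few lines. The following is a hypothetical illustration, not the actual ThrottledRetryingIterator API: the class and method names are invented, and the "90% of the last scanned count" backoff figure comes from the discussion later in this thread.

```java
// Hypothetical sketch of the retry-with-throttling idea described in the PR.
// Not the real ThrottledRetryingIterator API; all names are illustrative.
public final class ThrottleSketch {
    private int rowsPerTransaction; // 0 means "no limit"

    public ThrottleSketch(int initialRows) {
        this.rowsPerTransaction = initialRows;
    }

    /** After a failed transaction, retry with 90% of what was actually scanned. */
    public void onTransactionFailure(int scannedBeforeFailure) {
        rowsPerTransaction = Math.max(1, (scannedBeforeFailure * 9) / 10);
    }

    public int rowsPerTransaction() {
        return rowsPerTransaction;
    }

    public static void main(String[] args) {
        ThrottleSketch sketch = new ThrottleSketch(0);   // start unlimited
        sketch.onTransactionFailure(1000);               // e.g. a timeout after 1000 rows
        System.out.println(sketch.rowsPerTransaction()); // 900
    }
}
```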

@ohadzeliger ohadzeliger self-assigned this May 8, 2025
@ohadzeliger ohadzeliger requested a review from jjezra May 8, 2025 16:48
…record-validation-phase-2

# Conflicts:
#	fdb-record-layer-core/src/main/java/com/apple/foundationdb/record/provider/foundationdb/cursors/throttled/ThrottledRetryingIterator.java
#	fdb-record-layer-core/src/test/java/com/apple/foundationdb/record/provider/foundationdb/cursors/throttled/ThrottledIteratorTest.java
@ohadzeliger ohadzeliger changed the title Record validation phase 2 Create throttled and retrying record iterator May 14, 2025
@ohadzeliger ohadzeliger added the enhancement New feature or request label May 14, 2025
@ohadzeliger ohadzeliger marked this pull request as ready for review May 14, 2025 19:04
Comment on lines 327 to 334
if (maxLimit == 0) {
    return initialLimit;
} else {
    if (initialLimit == 0) {
        return maxLimit;
    } else {
        return Math.min(initialLimit, maxLimit);
    }
}
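For readers skimming the thread: in the excerpt above, 0 acts as "limit not set". Here is a standalone restatement of the same logic, illustrative only, with a hypothetical helper name:

```java
// Standalone restatement of the excerpt's logic: 0 means "limit not set".
// The method name effectiveLimit is invented for this demo.
public class EffectiveLimitDemo {
    static int effectiveLimit(int initialLimit, int maxLimit) {
        if (maxLimit == 0) {
            return initialLimit;
        }
        if (initialLimit == 0) {
            return maxLimit;
        }
        return Math.min(initialLimit, maxLimit);
    }

    public static void main(String[] args) {
        System.out.println(effectiveLimit(0, 0));    // 0: unlimited
        System.out.println(effectiveLimit(100, 0));  // 100
        System.out.println(effectiveLimit(0, 50));   // 50
        System.out.println(effectiveLimit(100, 50)); // 50
    }
}
```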
Contributor commented:

If I understand correctly, when the caller does not set these values (they default to 0), this iterator runs without a row limit until it hits its first failure, then immediately starts throttling (no grace retries).
However, consider this scenario:

  • Both max and initial limits are set to zero
  • A random failure occurs, setting a row limit
  • After every run of 40 consecutive successes, the row limit increases by 20%, to infinity and beyond, with many log messages.

This scenario would be blocked by the transaction time limit (of course), which makes me think that maybe we should force a non-zero time limit and remove the ability to set a max limit altogether, which would make the code simpler. If the caller, for whatever reason, really needs a max limit, they can implement it in the cursor generator.

Contributor commented:

Nit - should we rename this function to something that is not a member name? It can be a bit confusing...

@ohadzeliger (Contributor, Author) replied:

At every consequence of 40 successes, increasing the row limit 20%. To infinity and beyond with many log messages

I think this is reasonable behavior: the user wanted an unlimited iteration, but we hit an issue. We set the limit to 90% of the last scanned count and try again, striving to get back to unlimited, albeit gradually.
I considered allowing SUCCESS_INCREASE_THRESHOLD to be configurable, but that seemed unnecessary just yet.
The time limit is always set (implicitly to 5 seconds); I don't know if we should enforce further limits.

The advantage of a default max limit of 0 (IMO) is that this way users do not have to think about it unless there is a reason to. I believe the usage experience is better this way.

rename this function

Done

@jjezra (Contributor) commented May 15, 2025:

The advantage of a default max limit of 0 (IMO) is that this way users do not have to think about it unless there is a reason to. I believe the usage experience is better this way

I strongly agree. What I was suggesting is the equivalent of making a max limit of 0 mandatory. If the caller really needs to set a max limit, they can implement it in the cursor factory. Another reason I was thinking about eliminating the withMaxRecordsScannedPerTransaction and withInitialRecordsScannedPerTransaction functions from the API is that it is much easier to add them later (if needed) than to deprecate and remove them.
Doing that, we would probably need to verify a non-zero time limit.
@ScottDugas , what do you think?

@ohadzeliger (Contributor, Author) replied:

How about we mark the API EXPERIMENTAL and leave these options in? That way it will be easier to remove them if they are misused. I just feel that these settings are needed for better throttling controls.

@ohadzeliger (Contributor, Author) replied:

Removed the throttling controls for maxScannedPerTransaction and initialScannedPerTransaction and adjusted tests accordingly.

++successCounter;
if (((successCounter) % SUCCESS_INCREASE_THRESHOLD) == 0 && cursorRowsLimit < (quotaManager.scannedCount + 3)) {
    final int oldLimit = cursorRowsLimit;
    cursorRowsLimit = cursorRowsLimit((cursorRowsLimit * 5) / 4, maxRecordScannedPerTransaction);
@jjezra (Contributor) commented May 15, 2025:

This is a bug. If the value of cursorRowsLimit is less than 4, it will not be increased.

Suggested change:
- cursorRowsLimit = cursorRowsLimit((cursorRowsLimit * 5) / 4, maxRecordScannedPerTransaction);
+ final int increasedLimit = cursorRowsLimit < 16 ? (4 + cursorRowsLimit) : (cursorRowsLimit * 5) / 4;
+ cursorRowsLimit = cursorRowsLimit(increasedLimit, maxRecordScannedPerTransaction);
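The stagnation is easy to reproduce: with integer division, (3 * 5) / 4 == 3, so any limit below 4 never grows. A small illustrative demo (not project code) comparing the original expression with the suggested one:

```java
// Demonstrates the integer-division bug called out above (illustrative only).
public class LimitGrowthDemo {
    public static void main(String[] args) {
        for (int limit = 1; limit <= 6; limit++) {
            int naive = (limit * 5) / 4;                            // stuck for limit < 4
            int suggested = limit < 16 ? (4 + limit) : (limit * 5) / 4; // always grows
            System.out.println("limit=" + limit + " naive=" + naive + " suggested=" + suggested);
        }
    }
}
```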

@ohadzeliger (Contributor, Author) replied:

Right. Fixed, and added tests to show behavior for both increase and decrease.

- Added comments to tests
- Extracted increaseLimit and decreaseLimit methods, added tests
- Fixed issue with increase limit being stuck under 4
if (current == 0) {
    return 0;
}
int newLimit = Math.max((current * 5) / 4, current + 1);
Contributor commented:

If the count is under 16 (let alone 4), the assumption is that there was a problematic item. After 40 successes, though, we can probably raise a little more aggressively. Note that the default time limit makes sure that it will break only if the last item takes more than a second to process.

Suggested change:
- int newLimit = Math.max((current * 5) / 4, current + 1);
+ int newLimit = Math.max((current * 5) / 4, current + 4);

@ohadzeliger (Contributor, Author) replied:

Done.


- Increase increment to 4
- make class EXPERIMENTAL
- added executor to whileTrue call
- always clearWeakReadSemantics=true
- improve close cursors
- catch `RunnerClosed` exception, don't retry
- Remove row limit controls
- Fix tests
- add deleteLimit to tests
- fix delete limit off-by-one bug
Comment on lines 419 to 420
private Consumer<QuotaManager> transactionSuccessNotification;
private Consumer<QuotaManager> transactionInitNotification;
Contributor commented:

Nit: should the default null values be explicitly assigned?

Comment on lines 584 to 585
if (successNotification != null) {
    throttledIterator.withTransactionSuccessNotification(successNotification);
Contributor commented:

I like the way this iteratorBuilder allows testing of the unmodified default values.

Comment on lines 284 to 286
if (logger.isWarnEnabled()) {
    logger.warn(KeyValueLogMessage.of("ThrottledIterator: runner closed: will abort"), ex);
}
Collaborator commented:

I'm not sure this is necessary, but shouldn't produce too much noise.

- Register the onNext future with the future manager
- Add tests for cursor closing and onNext future closing
- Make most tests start iteration outside of the transaction
- Add try-with-resource block for the iterator in tests
}

@Test
void testThrottleIteratorTransactionTimeLimit() throws Exception {
Collaborator commented:

Based on the PRB, it looks like this test is flaky or broken.

@ScottDugas ScottDugas requested a review from jjezra May 27, 2025 15:23
@ohadzeliger ohadzeliger merged commit e5a5d44 into FoundationDB:main May 27, 2025
5 checks passed
@ohadzeliger ohadzeliger deleted the record-validation-phase-2 branch May 27, 2025 15:50
Labels
enhancement New feature or request

Successfully merging this pull request may close these issues.

Create an iterator that can retry and throttle
3 participants