Create throttled and retrying record iterator #3350


Merged: 22 commits into FoundationDB:main on May 27, 2025

Conversation

@ohadzeliger (Contributor) commented May 8, 2025:

This PR creates the ThrottledRetryingIterator for use in cases that need a reliable retrying mechanism with resource controls.
For the most part, this is new code that is not yet in use. Small changes to the existing code base (log message fields, an executor method made public, refactoring of FutureAutoClose) were made as well.
This class is meant to live at the same level as the other FDBDatabaseRunners, even though it does not implement that interface. Hence, it uses the TransactionalRunner and FutureAutoClose classes instead of the (already retrying) FDBDatabaseRunnerImpl.
The iterator implements resource controls (ops/sec and ops/transaction) for scan and delete operations. It does not implement similar controls for other operations (update, create, etc.); these can be added later when they are needed.

Resolves #3355
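The retry-with-throttling idea in this PR can be sketched in a few lines. The following is a hypothetical illustration, not the actual ThrottledRetryingIterator API: the class and method names are invented, and the "90% of the last scanned count" backoff figure comes from the discussion later in this thread.

```java
// Hypothetical sketch of the retry-with-throttling idea described in the PR.
// Not the real ThrottledRetryingIterator API; all names are illustrative.
public final class ThrottleSketch {
    private int rowsPerTransaction; // 0 means "no limit"

    public ThrottleSketch(int initialRows) {
        this.rowsPerTransaction = initialRows;
    }

    /** After a failed transaction, retry with 90% of what was actually scanned. */
    public void onTransactionFailure(int scannedBeforeFailure) {
        rowsPerTransaction = Math.max(1, (scannedBeforeFailure * 9) / 10);
    }

    public int rowsPerTransaction() {
        return rowsPerTransaction;
    }

    public static void main(String[] args) {
        ThrottleSketch sketch = new ThrottleSketch(0);   // start unlimited
        sketch.onTransactionFailure(1000);               // e.g. a timeout after 1000 rows
        System.out.println(sketch.rowsPerTransaction()); // 900
    }
}
```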

@ohadzeliger ohadzeliger self-assigned this May 8, 2025
@ohadzeliger ohadzeliger requested a review from jjezra May 8, 2025 16:48
…record-validation-phase-2

# Conflicts:
#	fdb-record-layer-core/src/main/java/com/apple/foundationdb/record/provider/foundationdb/cursors/throttled/ThrottledRetryingIterator.java
#	fdb-record-layer-core/src/test/java/com/apple/foundationdb/record/provider/foundationdb/cursors/throttled/ThrottledIteratorTest.java
@ohadzeliger ohadzeliger changed the title Record validation phase 2 Create throttled and retrying record iterator May 14, 2025
@ohadzeliger ohadzeliger added the enhancement New feature or request label May 14, 2025
@ohadzeliger ohadzeliger marked this pull request as ready for review May 14, 2025 19:04
Comment on lines 327 to 334
if (maxLimit == 0) {
    return initialLimit;
} else {
    if (initialLimit == 0) {
        return maxLimit;
    } else {
        return Math.min(initialLimit, maxLimit);
    }
}
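For readers skimming the thread: in the excerpt above, 0 acts as "limit not set". Here is a standalone restatement of the same logic, illustrative only, with a hypothetical helper name:

```java
// Standalone restatement of the excerpt's logic: 0 means "limit not set".
// The method name effectiveLimit is invented for this demo.
public class EffectiveLimitDemo {
    static int effectiveLimit(int initialLimit, int maxLimit) {
        if (maxLimit == 0) {
            return initialLimit;
        }
        if (initialLimit == 0) {
            return maxLimit;
        }
        return Math.min(initialLimit, maxLimit);
    }

    public static void main(String[] args) {
        System.out.println(effectiveLimit(0, 0));    // 0: unlimited
        System.out.println(effectiveLimit(100, 0));  // 100
        System.out.println(effectiveLimit(0, 50));   // 50
        System.out.println(effectiveLimit(100, 50)); // 50
    }
}
```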
Contributor commented:

If I understand correctly, when the caller does not set these values (they default to 0), this iterator runs without a row limit until it hits its first failure, then immediately starts throttling (no grace retries).
However, consider this scenario:

  • Both max and initial limits are set to zero
  • A random failure occurs, setting a row limit
  • After every run of 40 consecutive successes, the row limit increases by 20%, to infinity and beyond, with many log messages.

This scenario would be blocked by the transaction time limit (of course), which makes me think that maybe we should force a non-zero time limit and remove the ability to set a max limit altogether, which would make the code simpler. If the caller, for whatever reason, really needs a max limit, they can implement it in the cursor generator.

Contributor commented:

Nit - should we rename this function to something that is not a member name? It can be a bit confusing...

@ohadzeliger (Contributor, Author) replied:

At every consequence of 40 successes, increasing the row limit 20%. To infinity and beyond with many log messages

I think this is reasonable behavior: the user wanted an unlimited iteration, but we hit an issue. We set the limit to 90% of the last scanned count and try again, striving to get back to unlimited, albeit gradually.
I considered allowing SUCCESS_INCREASE_THRESHOLD to be configurable, but that seemed unnecessary just yet.
The time limit is always set (implicitly to 5 seconds); I don't know if we should enforce further limits.

The advantage of a default max limit of 0 (IMO) is that this way users do not have to think about it unless there is a reason to. I believe the usage experience is better this way.

rename this function

Done

@jjezra (Contributor) commented May 15, 2025:

The advantage of a default max limit of 0 (IMO) is that this way users do not have to think about it unless there is a reason to. I believe the usage experience is better this way

I strongly agree. What I was suggesting is the equivalent of making a max limit of 0 mandatory. If the caller really needs to set a max limit, they can implement it in the cursor factory. Another reason I was thinking about eliminating the withMaxRecordsScannedPerTransaction and withInitialRecordsScannedPerTransaction functions from the API is that it is much easier to add them later (if needed) than to deprecate and remove them.
Doing that, we would probably need to verify a non-zero time limit.
@ScottDugas , what do you think?

@ohadzeliger (Contributor, Author) replied:

How about we mark the API EXPERIMENTAL and leave these options in? That way it will be easier to remove them if they are misused. I just feel that these settings are needed for better throttling controls.

@ohadzeliger (Contributor, Author) replied:

Removed the throttling controls for maxScannedPerTransaction and initialScannedPerTransaction and adjusted tests accordingly.

++successCounter;
if (((successCounter) % SUCCESS_INCREASE_THRESHOLD) == 0 && cursorRowsLimit < (quotaManager.scannedCount + 3)) {
    final int oldLimit = cursorRowsLimit;
    cursorRowsLimit = cursorRowsLimit((cursorRowsLimit * 5) / 4, maxRecordScannedPerTransaction);
@jjezra (Contributor) commented May 15, 2025:

This is a bug. If the value of cursorRowsLimit is less than 4, it will not be increased.

Suggested change:
- cursorRowsLimit = cursorRowsLimit((cursorRowsLimit * 5) / 4, maxRecordScannedPerTransaction);
+ final int increasedLimit = cursorRowsLimit < 16 ? (4 + cursorRowsLimit) : (cursorRowsLimit * 5) / 4;
+ cursorRowsLimit = cursorRowsLimit(increasedLimit, maxRecordScannedPerTransaction);
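The stagnation is easy to reproduce: with integer division, (3 * 5) / 4 == 3, so any limit below 4 never grows. A small illustrative demo (not project code) comparing the original expression with the suggested one:

```java
// Demonstrates the integer-division bug called out above (illustrative only).
public class LimitGrowthDemo {
    public static void main(String[] args) {
        for (int limit = 1; limit <= 6; limit++) {
            int naive = (limit * 5) / 4;                            // stuck for limit < 4
            int suggested = limit < 16 ? (4 + limit) : (limit * 5) / 4; // always grows
            System.out.println("limit=" + limit + " naive=" + naive + " suggested=" + suggested);
        }
    }
}
```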

@ohadzeliger (Contributor, Author) replied:

Right. Fixed, and added tests to show behavior for both increase and decrease.

- Added comments to tests
- Extracted increaseLimit and decreaseLimit methods, added tests
- Fixed issue with increase limit being stuck under 4
if (current == 0) {
    return 0;
}
int newLimit = Math.max((current * 5) / 4, current + 1);
Contributor commented:

If the count is under 16 (let alone 4), the assumption is that there was a problematic item. After 40 successes, though, we can probably raise a little more aggressively. Note that the default time limit makes sure that it will break only if the last item takes more than a second to process.

Suggested change:
- int newLimit = Math.max((current * 5) / 4, current + 1);
+ int newLimit = Math.max((current * 5) / 4, current + 4);

@ohadzeliger (Contributor, Author) replied:

Done.


- Increase increment to 4
- make class EXPERIMENTAL
- added executor to whileTrue call
- always clearWeakReadSemantics=true
- improve close cursors
- catch `RunnerClosed` exception, don't retry
- Remove row limit controls
- Fix tests
- add deleteLimit to tests
- fix delete limit off-by-one bug
Comment on lines 419 to 420
private Consumer<QuotaManager> transactionSuccessNotification;
private Consumer<QuotaManager> transactionInitNotification;
Contributor commented:

Nit: should the default null values be explicitly assigned?

Comment on lines 584 to 585
if (successNotification != null) {
    throttledIterator.withTransactionSuccessNotification(successNotification);
Contributor commented:

I like the way this iteratorBuilder allows testing of the unmodified default values.

Comment on lines 284 to 286
if (logger.isWarnEnabled()) {
    logger.warn(KeyValueLogMessage.of("ThrottledIterator: runner closed: will abort"), ex);
}
Collaborator commented:

I'm not sure this is necessary, but shouldn't produce too much noise.

- Register the onNext future with the future manager
- Add tests for cursor closing and onNext future closing
- Make most tests start iteration outside of the transaction
- Add try-with-resource block for the iterator in tests
}

@Test
void testThrottleIteratorTransactionTimeLimit() throws Exception {
Collaborator commented:

Based on the PRB, it looks like this test is flaky or broken.

@ScottDugas ScottDugas requested a review from jjezra May 27, 2025 15:23
@ohadzeliger ohadzeliger merged commit e5a5d44 into FoundationDB:main May 27, 2025
5 checks passed
@ohadzeliger ohadzeliger deleted the record-validation-phase-2 branch May 27, 2025 15:50
Labels
enhancement New feature or request

Successfully merging this pull request may close these issues.

Create an iterator that can retry and throttle
3 participants