Skip to content

JDBC: Optimize writeEntity calls #1496

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Apr 30, 2025

Conversation

singhpk234
Copy link
Collaborator

@singhpk234 singhpk234 commented Apr 29, 2025

About the change

Optimize writeEntity calls by extracting the calls out of transaction. please ref to benchmarks below for the motivation for the change.

Note : Handling transaction failures and retries are not in the scope of this pr. This is purely for optimization purpose.


Running Branchmark :

started the jdbc implementation backed Service : https://github.com/apache/polaris/pull/1470/files#diff-44e2226f0bc5712cf1cd511efd94242a08c9760b947809c7ebe76f121f480c76

Updated creds via

--- a/benchmarks/src/gatling/resources/benchmark-defaults.conf
+++ b/benchmarks/src/gatling/resources/benchmark-defaults.conf
@@ -28,11 +28,11 @@ http {
 auth {
   # OAuth2 client ID for authentication
   # Required: Must be provided in configuration
-  client-id = null
+  client-id = root

   # OAuth2 client secret for authentication
   # Required: Must be provided in configuration
-  client-secret = null
+  client-secret = s3cr3t
 }

go to benchmark tools and run the following benchmarks :

# Dataset creation
./gradlew gatlingRun --simulation org.apache.polaris.benchmarks.simulations.CreateTreeDataset

# Read/Update operations
./gradlew gatlingRun --simulation org.apache.polaris.benchmarks.simulations.ReadUpdateTreeDataset

# Read-only operations
./gradlew gatlingRun --simulation org.apache.polaris.benchmarks.simulations.ReadTreeDataset

withRunInTransaction

========================================================================================================================
---- Global Information -------------------------------------------------------------|---Total---|-----OK----|----KO----
> request count                                                                      |    30,108 |    30,092 |        16
> min response time (ms)                                                             |         1 |         1 |         7
> max response time (ms)                                                             |       603 |       603 |       243
> mean response time (ms)                                                            |        17 |        17 |        99
> response time std deviation (ms)                                                   |        23 |        23 |        74
> response time 25th percentile (ms)                                                 |         6 |         6 |        44
> response time 50th percentile (ms)                                                 |        11 |        11 |        76
> response time 75th percentile (ms)                                                 |        22 |        22 |       171
> response time 99th percentile (ms)                                                 |        96 |        94 |       243
> mean throughput (rps)                                                              |      83.4 |     83.36 |      0.04
---- Response Time Distribution ----------------------------------------------------------------------------------------
> OK: t < 800 ms                                                                                         30,092 (99.95%)
> OK: 800 ms <= t < 1200 ms                                                                                   0     (0%)
> OK: t >= 1200 ms                                                                                            0     (0%)
> KO                                                                                                         16  (0.05%)
---- Errors ------------------------------------------------------------------------------------------------------------
> status.find.is(200), but actually found 500                                                                15 (93.75%)
> status.find.is(204), but actually found 500                                                                 1  (6.25%)
========================================================================================================================

without runInTransaction

========================================================================================================================
---- Global Information -------------------------------------------------------------|---Total---|-----OK----|----KO----
> request count                                                                      |    30,009 |    30,001 |         8
> min response time (ms)                                                             |         1 |         1 |        10
> max response time (ms)                                                             |       293 |       293 |        89
> mean response time (ms)                                                            |        16 |        16 |        33
> response time std deviation (ms)                                                   |        16 |        16 |        27
> response time 25th percentile (ms)                                                 |         6 |         6 |        16
> response time 50th percentile (ms)                                                 |        12 |        12 |        20
> response time 75th percentile (ms)                                                 |        23 |        23 |        60
> response time 99th percentile (ms)                                                 |        77 |        76 |        89
> mean throughput (rps)                                                              |     83.13 |     83.11 |      0.02
---- Response Time Distribution ----------------------------------------------------------------------------------------
> OK: t < 800 ms                                                                                         30,001 (99.97%)
> OK: 800 ms <= t < 1200 ms                                                                                   0     (0%)
> OK: t >= 1200 ms                                                                                            0     (0%)
> KO                                                                                                          8  (0.03%)
---- Errors ------------------------------------------------------------------------------------------------------------
> status.find.is(200), but actually found 500                                                                 6    (75%)
> status.find.is(204), but actually found 500                                                                 2    (25%)
========================================================================================================================

@dimas-b
Copy link
Contributor

dimas-b commented Apr 29, 2025

@singhpk234 : could you give a bit more details on the error that prompted this PR?

flyrain
flyrain previously approved these changes Apr 29, 2025
Copy link
Contributor

@flyrain flyrain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The number is reasonable and solid. The error rate dropped and overall response time dropped as well. Thanks @singhpk234 !

@github-project-automation github-project-automation bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board Apr 29, 2025
@singhpk234
Copy link
Collaborator Author

@dimas-b here is the logs for he same :

2025-04-29 20:35:59.540 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2025-04-29 20:35:59.545 UTC [63] LOG:  database system was shut down at 2025-04-29 20:35:59 UTC
2025-04-29 20:35:59.559 UTC [1] LOG:  database system is ready to accept connections
2025-04-29 20:39:14.685 UTC [175] ERROR:  could not serialize access due to read/write dependencies among transactions
2025-04-29 20:39:14.685 UTC [175] DETAIL:  Reason code: Canceled on conflict out to pivot 894, during read.
2025-04-29 20:39:14.685 UTC [175] HINT:  The transaction might succeed if retried.
2025-04-29 20:39:14.685 UTC [175] STATEMENT:  SELECT entity_version, to_purge_timestamp, internal_properties, catalog_id, purge_timestamp, sub_type_code, create_timestamp, last_update_timestamp, parent_id, name, id, drop_timestamp, properties, grant_records_version, type_code FROM POLARIS_SCHEMA.ENTITIES WHERE (catalog_id, id) IN ((0, 8213068413194098796),(0, 2285716631442680836),(0, 3182916641487612758),(3182916641487612758, 3957433893749870399),(0, 0),(3182916641487612758, 6835667964051947808)) AND realm_id = 'POLARIS'
2025-04-29 20:39:14.688 UTC [84] ERROR:  could not serialize access due to read/write dependencies among transactions
2025-04-29 20:39:14.688 UTC [84] DETAIL:  Reason code: Canceled on identification as a pivot, during commit attempt.
2025-04-29 20:39:14.688 UTC [84] HINT:  The transaction might succeed if retried.
2025-04-29 20:39:14.688 UTC [84] STATEMENT:  COMMIT
2025-04-29 20:39:14.869 UTC [173] ERROR:  could not serialize access due to read/write dependencies among transactions

flyrain
flyrain previously approved these changes Apr 29, 2025
Copy link
Contributor

@flyrain flyrain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on handle retry in another PR.

@singhpk234 singhpk234 changed the title JDBC: Remove transaction from atomic writes JDBC: Optimize writeEntity calls Apr 30, 2025
@singhpk234 singhpk234 force-pushed the feature/remove-transaction branch from e6a4e89 to 046233f Compare April 30, 2025 16:52
@@ -143,12 +138,12 @@ private void persistEntity(
@Nonnull PolarisCallContext callCtx,
@Nonnull PolarisBaseEntity entity,
PolarisBaseEntity originalEntity,
Statement statement)
CheckedFunction<String, Integer> fn)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need such a generic type here? Could this be updateFunction?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you suggesting making this always String, Integer ? or rename fn to updateFunction, I didn't made the name explicit updateFunction as we do both insert and update :P all DMLs. is kind of done statement.exceuteUpdate

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I didn't realize it's used for the different functions in Statement. Fundamentally I guess it's something like:

public class QueryAction {
  /** Returns the number of affected rows */
  public int apply(String statement);
}

But using a @FunctionalInterface makes sense. For the name, maybe we can just call it as queryConsumer or queryAction or something to make it clear

Comment on lines 23 to 26
@FunctionalInterface
interface CheckedFunction<T, R> {
R apply(T t) throws SQLException;
}
Copy link
Contributor

@flyrain flyrain Apr 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest, I don't think it improves the code. It feels like a case of overengineering. Personally, I still prefer the if/else approach, it's simple, clear, and gets the job done. I'm okay with this though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like T is always String and R is always int?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to @eric-maynard ' comment. Also, this interface can probably be an inner class in JdbcBasePersistenceImpl... it is not meant for sharing, is it?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted this functional interface to be used across, in case such more use case comes hence i kept generics and moved its to its own class, if we want to address this for this particular use case then defining it within the JdbcBasePersistenceImpl makes sense and removing generic also make sense.

dimas-b
dimas-b previously approved these changes Apr 30, 2025
Copy link
Contributor

@dimas-b dimas-b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extracting the calls out of transaction (in the description) sounds confusing to me. As far as I can tell all DML in Postgres runs in a transaction. This PR merely changes the method of executing that transaction over the JDBC API.

Comment on lines 23 to 26
@FunctionalInterface
interface CheckedFunction<T, R> {
R apply(T t) throws SQLException;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to @eric-maynard ' comment. Also, this interface can probably be an inner class in JdbcBasePersistenceImpl... it is not meant for sharing, is it?

@singhpk234 singhpk234 force-pushed the feature/remove-transaction branch 2 times, most recently from 4f6b1cc to 4f105bd Compare April 30, 2025 21:15
eric-maynard
eric-maynard previously approved these changes Apr 30, 2025
Copy link
Contributor

@eric-maynard eric-maynard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, much simpler now

@eric-maynard eric-maynard merged commit d3b24d2 into apache:main Apr 30, 2025
6 checks passed
@github-project-automation github-project-automation bot moved this from Ready to merge to Done in Basic Kanban Board Apr 30, 2025
@dimas-b
Copy link
Contributor

dimas-b commented Apr 30, 2025

@singhpk234 : could you re-run the perf. tests on the final state of this PR and re-compare with main before this change?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants