Provide a mutation count estimation and document how mutations are calculated #12670
Description
Is your feature request related to a problem? Please describe.
The Spanner quotas page (https://docs.cloud.google.com/spanner/quotas) only describes how secondary indexes affect mutation counts for delete operations. For inserts and updates, the docs state that "operations count with the multiplicity of the number of columns they affect" but do not describe how secondary indexes contribute to the count.
The full formula for inserts (number of columns + sum of columns across all secondary indexes) was only confirmed informally by Google's backend team in googleapis/google-cloud-go#1721.
Additionally, when creating a new index on an existing table, we observe an immediate increase in mutation counts for writes to that table during the schema change (write-only phase), before the index is fully backfilled. This behavior and its impact on mutation budgets is not documented.
This forces teams to reverse-engineer mutation counting logic, which is fragile and breaks when the internal counting rules change.
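To make the problem concrete, here is a minimal sketch of the kind of reverse-engineered logic we mean, using only the informally confirmed insert formula (columns written plus the sum of columns across all secondary indexes). `InsertMutationEstimate` is a hypothetical helper, not part of the client library. What exactly counts as a "column" of an index (for example, STORING columns) is an assumption the caller has to make, which is precisely the undocumented part.

```java
import java.util.List;

// Hypothetical helper for illustration only; not part of the Spanner Java client.
final class InsertMutationEstimate {
    private InsertMutationEstimate() {}

    /**
     * Estimated mutations for inserting one row, per the informally confirmed formula:
     * columns written + sum of the column counts of every secondary index on the table.
     * The per-index column counts are supplied by the caller (an assumption, since the
     * docs do not say which index columns are counted).
     */
    static long perRow(int columnsWritten, List<Integer> indexColumnCounts) {
        if (columnsWritten < 0) {
            throw new IllegalArgumentException("columnsWritten must be >= 0");
        }
        long total = columnsWritten;
        for (int count : indexColumnCounts) {
            if (count < 0) {
                throw new IllegalArgumentException("index column counts must be >= 0");
            }
            total += count;
        }
        return total;
    }

    /** Estimated mutations for inserting rowCount rows with the same shape. */
    static long forRows(long rowCount, int columnsWritten, List<Integer> indexColumnCounts) {
        return Math.multiplyExact(rowCount, perRow(columnsWritten, indexColumnCounts));
    }
}
```

For example, under these assumptions a 5-column insert into a table with two secondary indexes of 2 and 3 columns would be estimated at 10 mutations per row. The sketch deliberately ignores updates, deletes, generated columns, and the write-only phase of index creation, because we do not know the rules for those.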
Describe the solution you'd like
- A utility class in the Java client (e.g. MutationCountEstimator) that can calculate the expected mutation count for a given set of mutations before committing

OR

- Complete documentation of how mutations are actually calculated for all operation types (INSERT, UPDATE, INSERT_OR_UPDATE, REPLACE, DELETE), including:
  - Indexes with STORING clauses
  - Computed/generated columns
  - The impact on mutation counts during index creation (write-only phase before backfill completes)
Ideally both :D
Describe alternatives you've considered
- Reverse-engineering the count from schema metadata and secondary index definitions. This is error-prone and has caused production incidents when our calculation diverged from Spanner's actual count.
- Committing a single row first, reading getMutationCount() from CommitResponse, then using that to size remaining batches.
- Using a limit well below 80,000 (e.g. 75,000) to absorb miscalculations.
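The last two workarounds can be combined roughly as follows. This is a sketch under stated assumptions: the per-row mutation count is taken as an input (for instance the value observed from a single-row commit), every row is assumed to cost the same, and `BatchSizer` is a hypothetical name. It contains no Spanner client calls.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of the Spanner Java client.
final class BatchSizer {
    private BatchSizer() {}

    /** Largest number of rows whose combined mutation count stays within safetyLimit. */
    static int maxRowsPerBatch(long mutationsPerRow, long safetyLimit) {
        if (mutationsPerRow <= 0) {
            throw new IllegalArgumentException("mutationsPerRow must be > 0");
        }
        if (mutationsPerRow > safetyLimit) {
            throw new IllegalArgumentException("a single row exceeds the safety limit");
        }
        return (int) Math.min(Integer.MAX_VALUE, safetyLimit / mutationsPerRow);
    }

    /** Splits rows into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> rows, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be > 0");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < rows.size(); start += batchSize) {
            int end = Math.min(start + batchSize, rows.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(rows.subList(start, end)));
        }
        return batches;
    }
}
```

With an observed 10 mutations per row and a 75,000 safety limit, `maxRowsPerBatch` allows 7,500 rows per commit. The estimate still breaks if the per-row cost changes between the probe commit and later batches, for example when an index enters its write-only phase mid-load, which is why the headroom below 80,000 is needed at all.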
Additional context
- https://cloud.google.com/blog/products/databases/cloud-spanner-doubles-the-number-of-updates-per-transaction — references mutation counting but does not cover the index impact.
- googleapis/google-cloud-go#1721 ("spanner: Programmatically Calculating Spanner Max Mutation Set"): open since 2019, requesting programmatic mutation calculation. The formula was confirmed informally but never made it into the docs.
- The commit statistics page (https://docs.cloud.google.com/spanner/docs/commit-statistics) provides post-commit mutation counts but not pre-commit estimation.