Conversation

abbywh (Contributor) commented Jun 8, 2025

Summary

Testing out AQE (Adaptive Query Execution) in place of the manual optimizations from the Spark 2 era. The manual approach has proven somewhat ineffective at Netflix, where a single Chronon job can see a wide variety of shuffle partition sizes and we have had trouble tuning the coalesce parameter. There were a few key DataFrame/SQL operation rewrites as well.

As a side benefit, this made tests much faster.
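For reference, the behavior being tested is controlled by Spark's AQE confs; the values below are just the AQE-on configuration under discussion, not necessarily this PR's exact settings:

spark.conf.set("spark.sql.adaptive.enabled", "true")
spark.conf.set("spark.sql.adaptive.coalescePartitions.enabled", "true")
// advisory target size per coalesced shuffle partition
spark.conf.set("spark.sql.adaptive.advisoryPartitionSizeInBytes", "128m")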

Why / Goal

Improve performance and make the coalesce parameter easier to tune.

Test Plan

  • Added Unit Tests
  • [x] Covered by existing CI
  • [x] Integration tested

Reviewers

abbywh (Contributor, Author) commented Jun 8, 2025

Spark tests down from 25->15 minutes, that's a 40% reduction 🤯

pengyu-hou (Collaborator) commented:

> Spark tests down from 25->15 minutes, that's a 40% reduction 🤯

That is awesome!!

pengyu-hou (Collaborator) left a comment

Minor comment, it looks good overall. Thanks Abby!

val df = sparkSession.sql(query)
// if aqe auto coalesce is disabled, apply manual coalesce
val finalDf = if (!sparkSession.conf.get("spark.sql.adaptive.coalescePartitions.enabled", "true").toBoolean) {
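(For context, a minimal sketch of how this fallback plausibly reads in full; the coalesce call and the else branch are assumptions based on the surrounding discussion, not the literal diff:)

val autoCoalesceEnabled =
  sparkSession.conf.get("spark.sql.adaptive.coalescePartitions.enabled", "true").toBoolean
val finalDf =
  if (autoCoalesceEnabled) df // AQE coalesces shuffle partitions on its own
  else df.coalesce(partitionCount) // manual Spark 2-era fallback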
Collaborator:

should we use a constant or a variable to save the result of sparkSession.conf.get("spark.sql.adaptive.coalescePartitions.enabled", "true").toBoolean?

Collaborator:

Looks like there is already useAqeRoute; we could use something similar to refactor this.

Contributor Author:

yeah that's a good idea

Contributor Author:

We could override it for some testing too. I still want to validate that this does something useful in a production environment before investing more, though.

abbywh marked this pull request as ready for review July 21, 2025 21:53

def sql(query: String): DataFrame = {
val partitionCount = sparkSession.sparkContext.getConf.getInt("spark.default.parallelism", 1000)
val autoCoalesceEnabled = sparkSession.conf.get("spark.sql.adaptive.coalescePartitions.enabled", "true").toBoolean


Shall we default this to false if it is not set, to match the existing default behavior?

Contributor Author:

I think the idea is to change the default behavior, since it's much more performant in most cases.


logger.info(
s"\n----[Running query coalesced into at most $partitionCount partitions]----\n$query\n----[End of Query]----\n\n Query call path (not an error stack trace): \n$stackTraceStringPretty \n\n --------")
if (!autoCoalesceEnabled) {


[nit] Swap the if/else branches to handle the enabled case first; I find code easier to read without explicit negations :)

Same with the val finalDf if-else below.

Contributor Author:

This is a common style in this codebase, e.g. if (!tableExists(tableName)) return Seq.empty[String], which I chose to follow (I agree it's a bit wonky).

(df.count(), 1)
}
val useAqeRoute = sparkSession.conf.getOption("spark.sql.adaptive.enabled").contains("true") &&
sparkSession.conf.getOption("spark.sql.adaptive.coalescePartitions.enabled").contains("true")


Is it intentional that the "is AQE on" logic here checks both config options, while the logic in def sql() above only checks spark.sql.adaptive.coalescePartitions.enabled?

It also might be a good idea to have a common helper function to abstract away the "is AQE on" logic.
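(Something like this hypothetical helper could centralize the check; the name is illustrative, not from the PR:)

// true only when AQE and its partition coalescing are both enabled
def isAqeCoalesceEnabled(spark: org.apache.spark.sql.SparkSession): Boolean =
  spark.conf.getOption("spark.sql.adaptive.enabled").contains("true") &&
    spark.conf.getOption("spark.sql.adaptive.coalescePartitions.enabled").contains("true")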

stats: Option[DfStats],
sortByCols: Seq[String] = Seq.empty,
partitionCols: Seq[String] = Seq.empty): Unit = {
// get row count and table partition count statistics


Is it possible to restructure this code to minimize the diff, and thus the reviewer's mental load? Can we do something like:

if (useAqeRoute) {
   write_into_df_the_simple_way()
   return
}

The_rest_of_the_non_aqe_code_as_is()

That way, The_rest_of_the_non_aqe_code_as_is() stays at the same indentation level and wouldn't show up as a diff in the PR.

Contributor Author:

We eventually branch into the same code below; I'm not sure how to make this cleaner while keeping that. I definitely agree that this diff rendered weirdly.

"org.apache.spark.sql.catalyst.InternalRow$$anonfun$getAccessor$5",
"scala.collection.immutable.ArraySeq$ofRef",
"org.apache.spark.sql.catalyst.expressions.GenericInternalRow"
"org.apache.spark.sql.catalyst.expressions.GenericInternalRow",


Add an inline comment noting that these registrations are needed for AQE (that's just my guess as to why they are here :))?

Contributor Author:

It's actually more generic than that: AQE uses it, but some of the SQL expressions will output this too. I think this matches the current file organization.

A separate PR/GitHub issue could clone this and add more explanation to the class:
https://github.com/apache/spark/blob/dc687d4c83b877e90c8dc03fb88f13440d4ae911/core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala#L575
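(For readers unfamiliar with the linked file, the registration pattern is roughly the following; kryo here stands in for the serializer's Kryo instance:)

// register catalyst row classes by name so Kryo serializes them efficiently
// instead of falling back to writing full class names with every record
Seq(
  "scala.collection.immutable.ArraySeq$ofRef",
  "org.apache.spark.sql.catalyst.expressions.GenericInternalRow"
).foreach(name => kryo.register(Class.forName(name)))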

val shuffleParallelism = Math.max(dailyFileCount * nonZeroTablePartitionCount, minWriteShuffleParallelism)
val saltCol = "random_partition_salt"
val saltedDf = df.withColumn(saltCol, round(rand() * (dailyFileCount + 1)))
val sortedDf = df.sortWithinPartitions(sortByCols.map(col).toSeq: _*)


why are we ignoring the partitionCols that users may pass in?

Contributor Author:

Good question. This is kind of a weird one, but IIRC (it's been a while), including them actually breaks the table schema assumptions and fails the test.

We are essentially relying on AQE, as well as the writer plugin, to repartition into the correct partitions. The Catalyst engine rewrites this entire section to do the repartition before the sort (IIRC), so within a task the partition columns are all equal to each other, and therefore we don't sort by them here. Including them changes the plan in such a way that it no longer matches the previous Chronon plan. Maybe this is a Catalyst bug, but I'm pretty sure I ran into the same issue as Stripe, where the results would sometimes be out of order until I removed them.
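(A rough sketch of the write path being described, with partitionCols deliberately left out of the sort; tableName and the saveAsTable call are assumptions, since the actual write isn't shown in this excerpt:)

import org.apache.spark.sql.functions.col

// partition columns are NOT in the sort: Catalyst moves the writer's repartition
// ahead of this sort, so rows within a task already share partition values
val sortedDf = df.sortWithinPartitions(sortByCols.map(col): _*)
sortedDf.write
  .mode("overwrite")
  .partitionBy(partitionCols: _*)
  .saveAsTable(tableName) // hypothetical target table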
