
Conversation

@garlandz-db (Contributor) commented Nov 5, 2025:

What changes were proposed in this pull request?

This PR makes SparkConnectService rely on its own private SparkSession, which is intended only for copying session configs when creating new sessions.

Why are the changes needed?

The default session can get cleaned up, in which case SparkConnectService cannot recover: session creation fails on subsequent RPCs.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Added basic testing

Was this patch authored or co-authored using generative AI tooling?

Yes

     */
    def initializeBaseSession(sc: SparkContext): Unit = synchronized {
      if (baseSession.isEmpty) {
        baseSession = Some(SparkSession.builder().sparkContext(sc).getOrCreate())
Contributor:
Please call newSession() on this session. The session returned by getOrCreate() is either an existing session, or will be accessible to others. It can be tampered with, and we should avoid that.
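A toy sketch of the concern above, using stand-in classes rather than the real Spark API: `getOrCreate()` hands back the shared default instance, so anything stored from it can be mutated by other callers, while a `newSession()`-style copy stays private.

```scala
import scala.collection.mutable

// Toy registry modelling getOrCreate() semantics (names are stand-ins, not Spark).
class Session(initial: Map[String, String] = Map.empty) {
  val conf: mutable.Map[String, String] = mutable.Map(initial.toSeq: _*)
  def newSession(): Session = new Session(conf.toMap) // isolated copy of the configs
}

object Builder {
  private var default: Option[Session] = None
  // Like a builder's getOrCreate(): returns the shared default if one exists.
  def getOrCreate(): Session = synchronized {
    default.getOrElse { val s = new Session(); default = Some(s); s }
  }
}

val stored   = Builder.getOrCreate()               // shared: everyone gets this exact object
val isolated = Builder.getOrCreate().newSession()  // private copy, as the review suggests
Builder.getOrCreate().conf("k") = "tampered"       // some other caller mutates the default
println(stored.conf.get("k"))                      // Some(tampered): the stored session was tampered with
println(isolated.conf.get("k"))                    // None: the isolated copy is unaffected
```

Note the real `SparkSession.newSession()` shares the underlying SparkContext but gets fresh session state; the toy only models the config isolation.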

        return
      }

    sessionManager.initializeBaseSession(sc)
Contributor:
I think this is fine for now. At some point we should consider making the initialisation logic of Connect less singleton-heavy, so we can pass the SparkContext as a constructor parameter.

Contributor (author):
But the session manager is also a singleton object; what difference does it make for SparkConnectService to be one too?

     * Initialize the base SparkSession from the provided SparkContext.
     * This should be called once during SparkConnectService startup.
     */
    def initializeBaseSession(sc: SparkContext): Unit = synchronized {
Contributor:
You can drop synchronized here...
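If the method should remain safe to call from multiple threads even after dropping `synchronized`, a lock-free one-shot initialization is one alternative. A minimal sketch with a hypothetical helper (not something in the Spark codebase):

```scala
import java.util.concurrent.atomic.AtomicReference

// Hypothetical helper: first caller publishes a value, later calls get that value.
final class OneShot[A <: AnyRef] {
  private val ref = new AtomicReference[A](null.asInstanceOf[A])
  def getOrInit(make: => A): A = {
    // compareAndSet ensures exactly one value is published, without locking.
    if (ref.get() == null) ref.compareAndSet(null.asInstanceOf[A], make)
    ref.get()
  }
}

val base = new OneShot[String]
base.getOrInit("first")
println(base.getOrInit("second")) // first: the initially published value sticks
```

One caveat of this pattern: under a race, `make` may be evaluated by more than one thread even though only one result is published, which matters if construction has side effects.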


    private def newIsolatedSession(): SparkSession = {
      val active = SparkSession.active
      if (active.sparkContext.isStopped) {
Contributor:
@garlandz-db can you figure out why this branch is here? We may have to recreate the session if this is an actual problem...

Contributor (author):
The original PR: #43701 by Kent Yao. If the Spark context is stopped, then active.newSession() would throw an exception:

 org.apache.spark.SparkException: com.google.common.util.concurrent.UncheckedExecutionException: java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext.

My guess: the Spark cluster probably isn't useful at that point, but technically you can still call Spark Connect APIs, so we can create a valid SparkSession and continue handling the RPC.

However, our fix is tangential to that error: we do not need to use active in this case.
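The failure mode described above can be modelled with stand-in classes (the real exception comes from SparkContext rejecting calls once stopped; `Ctx`/`Sess` here are illustrative only):

```scala
// Toy model: once the context is stopped, newSession() can only throw,
// which is why relying on SparkSession.active is unrecoverable here.
class Ctx { @volatile var stopped = false; def stop(): Unit = stopped = true }

class Sess(val ctx: Ctx) {
  def newSession(): Sess = {
    if (ctx.stopped)
      throw new IllegalStateException("Cannot call methods on a stopped SparkContext.")
    new Sess(ctx)
  }
}

val active = new Sess(new Ctx)
active.ctx.stop()
val failed =
  try { active.newSession(); false }
  catch { case _: IllegalStateException => true }
println(failed) // true: newSession() on a stopped context throws
```

This is the scenario the PR sidesteps by copying configs from the service's private base session instead of the (possibly dead) active one.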
