@pan3793 pan3793 commented Oct 31, 2025

What changes were proposed in this pull request?

Implement `getSchemas` methods defined in `java.sql.DatabaseMetaData` for `SparkConnectDatabaseMetaData`.

    /**
     * Retrieves the schema names available in this database.  The results
     * are ordered by {@code TABLE_CATALOG} and
     * {@code TABLE_SCHEM}.
     *
     * <P>The schema columns are:
     *  <OL>
     *  <LI><B>TABLE_SCHEM</B> String {@code =>} schema name
     *  <LI><B>TABLE_CATALOG</B> String {@code =>} catalog name (may be {@code null})
     *  </OL>
     *
     * @return a {@code ResultSet} object in which each row is a
     *         schema description
     * @throws SQLException if a database access error occurs
     *
     */
    ResultSet getSchemas() throws SQLException;

    /**
     * Retrieves the schema names available in this database.  The results
     * are ordered by {@code TABLE_CATALOG} and
     * {@code TABLE_SCHEM}.
     *
     * <P>The schema columns are:
     *  <OL>
     *  <LI><B>TABLE_SCHEM</B> String {@code =>} schema name
     *  <LI><B>TABLE_CATALOG</B> String {@code =>} catalog name (may be {@code null})
     *  </OL>
     *
     *
     * @param catalog a catalog name; must match the catalog name as it is stored
     * in the database; {@code ""} retrieves those without a catalog; {@code null} means catalog
     * name should not be used to narrow down the search.
     * @param schemaPattern a schema name; must match the schema name as it is
     * stored in the database; {@code null} means
     * schema name should not be used to narrow down the search.
     * @return a {@code ResultSet} object in which each row is a
     *         schema description
     * @throws SQLException if a database access error occurs
     * @see #getSearchStringEscape
     * @since 1.6
     */
    ResultSet getSchemas(String catalog, String schemaPattern) throws SQLException;

Why are the changes needed?

Enhance API coverage of the Connect JDBC driver. For example, the `get[Catalogs|Schemas|Tables|...]` APIs are used by SQL GUI tools such as DBeaver to display the database tree view.

Does this PR introduce any user-facing change?

No, the Connect JDBC driver is a new feature under development.

How was this patch tested?

A new UT is added; also tested via DBeaver - the catalog/schema tree works now.

<img width="1260" height="892" alt="Xnip2025-11-01_01-33-38" src="https://github.com/user-attachments/assets/ca678627-e07c-430a-9750-e7ea1d69aecf" />

Was this patch authored or co-authored using generative AI tooling?

No.

override def dataDefinitionIgnoredInTransactions: Boolean = false

private def isNullOrWildcard(pattern: String): Boolean =
  pattern == null || pattern == "%"
Member Author
This is used to test whether a `fooPattern` argument matches ALL.

https://docs.oracle.com/en/java/javase/17/docs/api/java.sql/java/sql/DatabaseMetaData.html

Some DatabaseMetaData methods take arguments that are String patterns. These arguments all have names such as fooPattern. Within a pattern String, "%" means match any substring of 0 or more characters, and "_" means match any one character. Only metadata entries matching the search pattern are returned. If a search pattern argument is set to null, that argument's criterion will be dropped from the search.
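The pattern rules quoted above can be sketched as a small, self-contained helper that compiles a JDBC search pattern into a regex. This is an illustration only, not the driver's actual implementation (which pushes the pattern down as a SQL `LIKE` clause); `matchesJdbcPattern` is a hypothetical name, and escape-character handling is omitted for brevity.

```java
import java.util.regex.Pattern;

// Illustration of the DatabaseMetaData pattern semantics quoted above:
// "%" matches any substring, "_" matches any single character, and a null
// pattern drops the criterion (i.e. matches everything). Escape handling
// is intentionally omitted to keep the sketch short.
public class JdbcPatternDemo {
    static boolean matchesJdbcPattern(String value, String pattern) {
        if (pattern == null) {
            return true; // null: criterion is dropped from the search
        }
        StringBuilder regex = new StringBuilder();
        for (char c : pattern.toCharArray()) {
            switch (c) {
                case '%': regex.append(".*"); break;                      // any substring
                case '_': regex.append('.'); break;                       // any one char
                default:  regex.append(Pattern.quote(String.valueOf(c))); // literal
            }
        }
        return Pattern.matches(regex.toString(), value);
    }

    public static void main(String[] args) {
        System.out.println(matchesJdbcPattern("default", "def%"));    // true
        System.out.println(matchesJdbcPattern("default", "d_fault")); // true
        System.out.println(matchesJdbcPattern("default", "tmp%"));    // false
        System.out.println(matchesJdbcPattern("anything", null));     // true
    }
}
```

Under these semantics, `isNullOrWildcard` exploits the special case where the pattern is `null` or exactly `"%"`: both match every entry, so the filter can be skipped entirely rather than translated into a `LIKE` clause.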

@pan3793 pan3793 force-pushed the SPARK-54112 branch 2 times, most recently from ee3d78f to f55afd6 Compare November 3, 2025 06:47
@pan3793 pan3793 marked this pull request as ready for review November 3, 2025 06:47

pan3793 commented Nov 3, 2025

This is ready for review, cc @LuciferYang.

@dongjoon-hyun
Member

Gentle ping, @pan3793 .


- override def getSearchStringEscape: String =
-   throw new SQLFeatureNotSupportedException
+ override def getSearchStringEscape: String = "\\"
Member Author

Spark SQL uses backslash as the default escape char for the LIKE expression; it also supports a custom escape char.

https://spark.apache.org/docs/latest/sql-ref-syntax-qry-select-like.html

[ NOT ] { LIKE search_pattern [ ESCAPE esc_char ] | [ RLIKE | REGEXP ] regex_pattern }
...

...

  • esc_char
    Specifies the escape character. The default escape character is `\`.
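Given that `getSearchStringEscape` now returns `"\\"`, a client that wants to look up a schema whose name literally contains `_` or `%` must escape those characters before passing the name as a pattern. A minimal sketch, assuming a hypothetical helper name `escapeSearchString` (not driver API):

```java
// Sketch: escape LIKE wildcards in a literal name using the escape string
// reported by DatabaseMetaData.getSearchStringEscape() (backslash here).
public class EscapeDemo {
    static String escapeSearchString(String literal, String esc) {
        StringBuilder out = new StringBuilder();
        for (char c : literal.toCharArray()) {
            // Escape the wildcards and the escape character itself.
            if (c == '%' || c == '_' || (!esc.isEmpty() && c == esc.charAt(0))) {
                out.append(esc);
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "my_schema" as a raw pattern would also match "myXschema",
        // since "_" is a single-character wildcard.
        System.out.println(escapeSearchString("my_schema", "\\")); // my\_schema
    }
}
```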

private def getSchemasDataFrame(
    catalog: String, schemaPattern: String): connect.DataFrame = {

  val schemaFilterClause = if (isNullOrWildcard(schemaPattern)) {
Contributor
How about:

    val schemaFilter = if (isNullOrWildcard(schemaPattern)) {
      lit(true)
    } else {
      col("TABLE_SCHEM").like(schemaPattern)
    }

Member Author
Thank you for the suggestion, this sounds better; addressed in 711f602.

@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM (Pending CIs).

dongjoon-hyun pushed a commit that referenced this pull request Nov 10, 2025
…aData


Closes #52819 from pan3793/SPARK-54112.

Authored-by: Cheng Pan <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 52fe51b)
Signed-off-by: Dongjoon Hyun <[email protected]>
@dongjoon-hyun

Merged to master/4.1 for Apache Spark 4.1.0. Thank you, @pan3793 and @LuciferYang .
