@pan3793 pan3793 commented Oct 31, 2025

What changes were proposed in this pull request?

Implement `getSchemas` methods defined in `java.sql.DatabaseMetaData` for `SparkConnectDatabaseMetaData`.

    /**
     * Retrieves the schema names available in this database.  The results
     * are ordered by {@code TABLE_CATALOG} and
     * {@code TABLE_SCHEM}.
     *
     * <P>The schema columns are:
     *  <OL>
     *  <LI><B>TABLE_SCHEM</B> String {@code =>} schema name
     *  <LI><B>TABLE_CATALOG</B> String {@code =>} catalog name (may be {@code null})
     *  </OL>
     *
     * @return a {@code ResultSet} object in which each row is a
     *         schema description
     * @throws SQLException if a database access error occurs
     *
     */
    ResultSet getSchemas() throws SQLException;

    /**
     * Retrieves the schema names available in this database.  The results
     * are ordered by {@code TABLE_CATALOG} and
     * {@code TABLE_SCHEM}.
     *
     * <P>The schema columns are:
     *  <OL>
     *  <LI><B>TABLE_SCHEM</B> String {@code =>} schema name
     *  <LI><B>TABLE_CATALOG</B> String {@code =>} catalog name (may be {@code null})
     *  </OL>
     *
     *
     * @param catalog a catalog name; must match the catalog name as it is stored
     * in the database; {@code ""} retrieves those without a catalog; {@code null} means catalog
     * name should not be used to narrow down the search.
     * @param schemaPattern a schema name; must match the schema name as it is
     * stored in the database; {@code null} means
     * schema name should not be used to narrow down the search.
     * @return a {@code ResultSet} object in which each row is a
     *         schema description
     * @throws SQLException if a database access error occurs
     * @see #getSearchStringEscape
     * @since 1.6
     */
    ResultSet getSchemas(String catalog, String schemaPattern) throws SQLException;

Why are the changes needed?

Enhance API coverage of the Connect JDBC driver. For example, the `get[Catalogs|Schemas|Tables|...]` APIs are used by SQL GUI tools such as DBeaver to display the database tree view.

Does this PR introduce any user-facing change?

No, the Connect JDBC driver is a new feature under development.

How was this patch tested?

A new UT is added; also tested via DBeaver - the catalog/schema tree works now.

<img width="1260" height="892" alt="Xnip2025-11-01_01-33-38" src="https://github.com/user-attachments/assets/ca678627-e07c-430a-9750-e7ea1d69aecf" />

Was this patch authored or co-authored using generative AI tooling?

No.

override def dataDefinitionIgnoredInTransactions: Boolean = false

private def isNullOrWildcard(pattern: String): Boolean =
  pattern == null || pattern == "%"
Member Author
This is used to test whether a `fooPattern` argument matches ALL.

https://docs.oracle.com/en/java/javase/17/docs/api/java.sql/java/sql/DatabaseMetaData.html

Some DatabaseMetaData methods take arguments that are String patterns. These arguments all have names such as fooPattern. Within a pattern String, "%" means match any substring of 0 or more characters, and "_" means match any one character. Only metadata entries matching the search pattern are returned. If a search pattern argument is set to null, that argument's criterion will be dropped from the search.
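The pattern rules quoted above can be sketched as a small, self-contained helper that compiles a JDBC search pattern into a regex. This is an illustration only, not the driver's actual implementation (which pushes the pattern down as a SQL `LIKE` clause); `matchesJdbcPattern` is a hypothetical name, and escape-character handling is omitted for brevity.

```java
import java.util.regex.Pattern;

// Illustration of the DatabaseMetaData pattern semantics quoted above:
// "%" matches any substring, "_" matches any single character, and a null
// pattern drops the criterion (i.e. matches everything). Escape handling
// is intentionally omitted to keep the sketch short.
public class JdbcPatternDemo {
    static boolean matchesJdbcPattern(String value, String pattern) {
        if (pattern == null) {
            return true; // null: criterion is dropped from the search
        }
        StringBuilder regex = new StringBuilder();
        for (char c : pattern.toCharArray()) {
            switch (c) {
                case '%': regex.append(".*"); break;                      // any substring
                case '_': regex.append('.'); break;                       // any one char
                default:  regex.append(Pattern.quote(String.valueOf(c))); // literal
            }
        }
        return Pattern.matches(regex.toString(), value);
    }

    public static void main(String[] args) {
        System.out.println(matchesJdbcPattern("default", "def%"));    // true
        System.out.println(matchesJdbcPattern("default", "d_fault")); // true
        System.out.println(matchesJdbcPattern("default", "tmp%"));    // false
        System.out.println(matchesJdbcPattern("anything", null));     // true
    }
}
```

Under these semantics, `isNullOrWildcard` exploits the special case where the pattern is `null` or exactly `"%"`: both match every entry, so the filter can be skipped entirely rather than translated into a `LIKE` clause.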

@pan3793 pan3793 force-pushed the SPARK-54112 branch 2 times, most recently from ee3d78f to f55afd6 Compare November 3, 2025 06:47
@pan3793 pan3793 marked this pull request as ready for review November 3, 2025 06:47

pan3793 commented Nov 3, 2025

This is ready for review, cc @LuciferYang.

@dongjoon-hyun
Member

Gentle ping, @pan3793 .


- override def getSearchStringEscape: String =
-   throw new SQLFeatureNotSupportedException
+ override def getSearchStringEscape: String = "\\"
Member Author

Spark SQL uses backslash as the default escape char for the LIKE expression; it also supports a custom escape char.

https://spark.apache.org/docs/latest/sql-ref-syntax-qry-select-like.html

[ NOT ] { LIKE search_pattern [ ESCAPE esc_char ] | [ RLIKE | REGEXP ] regex_pattern }
...

...

  • esc_char
    Specifies the escape character. The default escape character is `\`.
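Given that `getSearchStringEscape` now returns `"\\"`, a client that wants to look up a schema whose name literally contains `_` or `%` must escape those characters before passing the name as a pattern. A minimal sketch, assuming a hypothetical helper name `escapeSearchString` (not driver API):

```java
// Sketch: escape LIKE wildcards in a literal name using the escape string
// reported by DatabaseMetaData.getSearchStringEscape() (backslash here).
public class EscapeDemo {
    static String escapeSearchString(String literal, String esc) {
        StringBuilder out = new StringBuilder();
        for (char c : literal.toCharArray()) {
            // Escape the wildcards and the escape character itself.
            if (c == '%' || c == '_' || (!esc.isEmpty() && c == esc.charAt(0))) {
                out.append(esc);
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "my_schema" as a raw pattern would also match "myXschema",
        // since "_" is a single-character wildcard.
        System.out.println(escapeSearchString("my_schema", "\\")); // my\_schema
    }
}
```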

private def getSchemasDataFrame(
    catalog: String, schemaPattern: String): connect.DataFrame = {

  val schemaFilterClause = if (isNullOrWildcard(schemaPattern)) {
Contributor
How about:

    val schemaFilter = if (isNullOrWildcard(schemaPattern)) {
      lit(true)
    } else {
      col("TABLE_SCHEM").like(schemaPattern)
    }

Member Author
Thank you for the suggestion, this sounds better; addressed in 711f602.

@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM (Pending CIs).

dongjoon-hyun pushed a commit that referenced this pull request Nov 10, 2025
…aData


Closes #52819 from pan3793/SPARK-54112.

Authored-by: Cheng Pan <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 52fe51b)
Signed-off-by: Dongjoon Hyun <[email protected]>
@dongjoon-hyun

Merged to master/4.1 for Apache Spark 4.1.0. Thank you, @pan3793 and @LuciferYang .
