[SPARK-54112][CONNECT] Support getSchemas for SparkConnectDatabaseMetaData #52819
Conversation
```scala
override def dataDefinitionIgnoredInTransactions: Boolean = false

private def isNullOrWildcard(pattern: String): Boolean =
  pattern == null || pattern == "%"
```
This is used to test whether a `fooPattern` argument matches ALL entries. From the `java.sql.DatabaseMetaData` javadoc:
https://docs.oracle.com/en/java/javase/17/docs/api/java.sql/java/sql/DatabaseMetaData.html

> Some DatabaseMetaData methods take arguments that are String patterns. These arguments all have names such as fooPattern. Within a pattern String, "%" means match any substring of 0 or more characters, and "_" means match any one character. Only metadata entries matching the search pattern are returned. If a search pattern argument is set to null, that argument's criterion will be dropped from the search.
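To make those semantics concrete, here is a minimal standalone sketch of how a `fooPattern` could map onto a predicate — a hypothetical helper for illustration, not the PR's code:

```scala
import java.util.regex.Pattern

// Hypothetical sketch of the JDBC search-pattern semantics quoted above:
// "%" matches any substring, "_" matches any single character, and a
// null pattern drops the criterion from the search entirely.
def patternToPredicate(pattern: String): String => Boolean = {
  if (pattern == null || pattern == "%") {
    _ => true // criterion dropped: everything matches
  } else {
    // Translate JDBC wildcards into a regex: % -> .*, _ -> .
    val regex = pattern.flatMap {
      case '%' => ".*"
      case '_' => "."
      case c   => Pattern.quote(c.toString)
    }
    s => s.matches(regex)
  }
}

// patternToPredicate("te_t%")("test_db") == true
// patternToPredicate(null)("anything")   == true
```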
This is ready for review, cc @LuciferYang
Gentle ping, @pan3793.
```scala
// before
override def getSearchStringEscape: String =
  throw new SQLFeatureNotSupportedException
// after
override def getSearchStringEscape: String = "\\"
```
Spark SQL uses backslash as the default escape character for `LIKE` expressions; it also supports a custom escape character.
https://spark.apache.org/docs/latest/sql-ref-syntax-qry-select-like.html

> [ NOT ] { LIKE search_pattern [ ESCAPE esc_char ] | [ RLIKE | REGEXP ] regex_pattern }
>
> - esc_char
>   Specifies the escape character. The default escape character is `\`.
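As an illustration, a client that wants to match a literal `_` or `%` would escape it with this string before passing it as a search pattern. A hypothetical helper, not part of the PR:

```scala
// Hypothetical helper: escape literal wildcard characters in an
// identifier before using it as a DatabaseMetaData search pattern,
// using the "\" value that getSearchStringEscape now returns.
def escapeSearchString(s: String, escape: String = "\\"): String =
  s.flatMap {
    case c @ ('%' | '_') => escape + c
    case '\\'            => escape + "\\"
    case c               => c.toString
  }

// escapeSearchString("my_schema") == "my\\_schema" (a literal underscore)
```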
```scala
private def getSchemasDataFrame(
    catalog: String, schemaPattern: String): connect.DataFrame = {

  val schemaFilterClause = if (isNullOrWildcard(schemaPattern)) {
    // ...
```
How about:
```scala
val schemaFilter = if (isNullOrWildcard(schemaPattern)) {
  lit(true)
} else {
  col("TABLE_SCHEM").like(schemaPattern)
}
```
Thank you for the suggestion, this sounds better; addressed in 711f602.
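For context, a rough sketch of how the suggested filter might compose into the schema-listing query. The `SHOW SCHEMAS` shape and the `spark`, `catalog`, and `schemaPattern` names are assumptions here, not the exact PR code:

```scala
import org.apache.spark.sql.functions.{col, lit}

// Assumed shape of getSchemasDataFrame: list schemas, rename columns to
// the JDBC contract, apply the pattern filter, and order per the javadoc.
val schemaFilter = if (isNullOrWildcard(schemaPattern)) {
  lit(true) // pattern matches everything, so the filter is a no-op
} else {
  col("TABLE_SCHEM").like(schemaPattern)
}
val df = spark.sql(s"SHOW SCHEMAS IN `$catalog`")
  .select(col("namespace").as("TABLE_SCHEM"), lit(catalog).as("TABLE_CATALOG"))
  .filter(schemaFilter)
  .orderBy("TABLE_CATALOG", "TABLE_SCHEM")
```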
dongjoon-hyun left a comment:
+1, LGTM (Pending CIs).
[SPARK-54112][CONNECT] Support getSchemas for SparkConnectDatabaseMetaData
### What changes were proposed in this pull request?
Implement `getSchemas` methods defined in `java.sql.DatabaseMetaData` for `SparkConnectDatabaseMetaData`.
```java
/**
 * Retrieves the schema names available in this database. The results
 * are ordered by {@code TABLE_CATALOG} and
 * {@code TABLE_SCHEM}.
 *
 * <P>The schema columns are:
 *  <OL>
 *  <LI><B>TABLE_SCHEM</B> String {@code =>} schema name
 *  <LI><B>TABLE_CATALOG</B> String {@code =>} catalog name (may be {@code null})
 *  </OL>
 *
 * @return a {@code ResultSet} object in which each row is a
 *         schema description
 * @throws SQLException if a database access error occurs
 *
 */
ResultSet getSchemas() throws SQLException;

/**
 * Retrieves the schema names available in this database. The results
 * are ordered by {@code TABLE_CATALOG} and
 * {@code TABLE_SCHEM}.
 *
 * <P>The schema columns are:
 *  <OL>
 *  <LI><B>TABLE_SCHEM</B> String {@code =>} schema name
 *  <LI><B>TABLE_CATALOG</B> String {@code =>} catalog name (may be {@code null})
 *  </OL>
 *
 * @param catalog a catalog name; must match the catalog name as it is stored
 * in the database; "" retrieves those without a catalog; null means catalog
 * name should not be used to narrow down the search.
 * @param schemaPattern a schema name; must match the schema name as it is
 * stored in the database; null means
 * schema name should not be used to narrow down the search.
 * @return a {@code ResultSet} object in which each row is a
 *         schema description
 * @throws SQLException if a database access error occurs
 * @see #getSearchStringEscape
 * @since 1.6
 */
ResultSet getSchemas(String catalog, String schemaPattern) throws SQLException;
```
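For a sense of the client-side usage, a minimal sketch of calling the new API through the driver; the `jdbc:sc://` URL and port are placeholders:

```scala
import java.sql.DriverManager

// Illustrative client usage of the new API; URL/port are placeholders.
val conn = DriverManager.getConnection("jdbc:sc://localhost:15002")
try {
  val md = conn.getMetaData
  // null catalog: do not narrow by catalog; the pattern narrows the schemas.
  val rs = md.getSchemas(null, "def%")
  while (rs.next()) {
    println(s"${rs.getString("TABLE_CATALOG")}.${rs.getString("TABLE_SCHEM")}")
  }
} finally {
  conn.close()
}
```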
### Why are the changes needed?
Enhance API coverage of the Connect JDBC driver; for example, the `get[Catalogs|Schemas|Tables|...]` APIs are used by SQL GUI tools such as DBeaver to display the catalog/schema tree.
### Does this PR introduce _any_ user-facing change?
No, Connect JDBC driver is a new feature under development.
### How was this patch tested?
A new UT is added; also tested via DBeaver - the catalog/schema tree works now.
![DBeaver catalog/schema tree](https://github.com/user-attachments/assets/ca678627-e07c-430a-9750-e7ea1d69aecf)
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #52819 from pan3793/SPARK-54112.
Authored-by: Cheng Pan <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 52fe51b)
Signed-off-by: Dongjoon Hyun <[email protected]>
Merged to master/4.1 for Apache Spark 4.1.0. Thank you, @pan3793 and @LuciferYang.