Commit 40a5126

vladimirg-db authored and cloud-fan committed
[SPARK-54178][SQL] Improve error for ResolveSQLOnFile
### What changes were proposed in this pull request?

Improve error for `ResolveSQLOnFile` - generic error does not mean that the data source is not supported!

### Why are the changes needed?

Currently `ResolveSQLOnFile` throws `UNSUPPORTED_DATASOURCE_FOR_DIRECT_QUERY` for a generic failure when discovering files and figuring out the file schemas. This is confusing. We need a separate error.

### Does this PR introduce _any_ user-facing change?

Better error message.

### How was this patch tested?

Hard to create a test, because the files need to be corrupted.

### Was this patch authored or co-authored using generative AI tooling?

Claude.

Closes #52875 from vladimirg-db/vladimir-golubev_data/better-error-for-resolve-sql-on-file.

Authored-by: Vladimir Golubev <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
1 parent 6b91d83 commit 40a5126

3 files changed: +16 -6 lines changed

common/utils/src/main/resources/error/error-conditions.json

Lines changed: 6 additions & 0 deletions
@@ -1832,6 +1832,12 @@
     ],
     "sqlState" : "2203G"
   },
+  "FAILED_TO_CREATE_PLAN_FOR_DIRECT_QUERY" : {
+    "message" : [
+      "Failed to create plan for direct query on files: <dataSourceType>"
+    ],
+    "sqlState" : "58030"
+  },
   "FAILED_TO_LOAD_ROUTINE" : {
     "message" : [
       "Failed to load routine <routineName>."
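
The new entry's message contains a `<dataSourceType>` placeholder that is filled from the `messageParameters` map passed by the caller. As a rough illustration of that substitution, here is a minimal standalone sketch; `MessageTemplateSketch` and `render` are hypothetical names invented for this example and are not Spark's actual templating code.

```scala
// Hypothetical, simplified stand-in for error-message templating.
// This is NOT Spark's implementation; it only illustrates placeholder substitution.
object MessageTemplateSketch {
  // Template copied from the new error-conditions.json entry.
  val template: String =
    "Failed to create plan for direct query on files: <dataSourceType>"

  // Replace each <name> placeholder with the matching parameter value.
  def render(template: String, params: Map[String, String]): String =
    params.foldLeft(template) { case (msg, (name, value)) =>
      msg.replace(s"<${name}>", value)
    }

  def main(args: Array[String]): Unit = {
    // prints "Failed to create plan for direct query on files: parquet"
    println(render(template, Map("dataSourceType" -> "parquet")))
  }
}
```

The key in the parameter map has to match the placeholder name exactly, which is why the Scala helper in this commit uses `"dataSourceType"` as its map key.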

sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala

Lines changed: 8 additions & 0 deletions
@@ -1770,6 +1770,14 @@ private[sql] object QueryCompilationErrors extends QueryErrorsBase with Compilat
       messageParameters = Map("provider" -> provider))
   }
 
+  def failedToCreatePlanForDirectQueryError(
+      dataSourceType: String, cause: Throwable): Throwable = {
+    new AnalysisException(
+      errorClass = "FAILED_TO_CREATE_PLAN_FOR_DIRECT_QUERY",
+      messageParameters = Map("dataSourceType" -> dataSourceType),
+      cause = Some(cause))
+  }
+
   def findMultipleDataSourceError(provider: String, sourceNames: Seq[String]): Throwable = {
     new AnalysisException(
       errorClass = "_LEGACY_ERROR_TEMP_1141",

sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/rules.scala

Lines changed: 2 additions & 6 deletions
@@ -65,12 +65,8 @@ class ResolveSQLOnFile(sparkSession: SparkSession) extends Rule[LogicalPlan] {
               messageParameters = e.getMessageParameters.asScala.toMap)
           case _: ClassNotFoundException => None
           case e: Exception if !e.isInstanceOf[AnalysisException] =>
-            // the provider is valid, but failed to create a logical plan
-            u.failAnalysis(
-              errorClass = "UNSUPPORTED_DATASOURCE_FOR_DIRECT_QUERY",
-              messageParameters = Map("dataSourceType" -> u.multipartIdentifier.head),
-              cause = e
-            )
+            throw QueryCompilationErrors.failedToCreatePlanForDirectQueryError(
+              u.multipartIdentifier.head, e)
         }
       case _ =>
         None
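
The catch block above separates three outcomes: an unknown provider (`ClassNotFoundException`) yields `None`, an existing analysis error is passed on by an earlier case, and any other failure is now wrapped in a dedicated error that keeps the original exception as its cause. The following standalone sketch mirrors that shape without depending on Spark. `DirectQueryPlanError`, `ResolveSketch`, and `resolve` are hypothetical names made up for illustration, and the plan is modelled as a plain `String`.

```scala
// Simplified stand-ins; every name here is hypothetical and not a Spark class.
final class DirectQueryPlanError(val dataSourceType: String, cause: Throwable)
  extends Exception(
    s"Failed to create plan for direct query on files: $dataSourceType", cause)

object ResolveSketch {
  // Returns Some(plan) on success and None when the provider class is unknown.
  // Any other failure is rethrown, wrapped with the original exception as cause.
  def resolve(dataSourceType: String)(buildPlan: => String): Option[String] =
    try Some(buildPlan)
    catch {
      case _: ClassNotFoundException => None
      // Already the dedicated error: pass it through unchanged.
      case e: DirectQueryPlanError => throw e
      // Generic failure, for example while reading file schemas.
      case e: Exception => throw new DirectQueryPlanError(dataSourceType, e)
    }

  def main(args: Array[String]): Unit = {
    println(resolve("parquet")("plan"))
    println(resolve("nosuch")(throw new ClassNotFoundException("nosuch")))
    try resolve("parquet")(throw new java.io.IOException("corrupted footer"))
    catch {
      case e: DirectQueryPlanError =>
        println(s"${e.getMessage} (cause: ${e.getCause.getMessage})")
    }
  }
}
```

Keeping the original exception as the cause means a caller can still see the underlying I/O or schema failure, while the top-level error no longer claims that the data source is unsupported.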
