1011 Disable full stack trace when using spark connect #1012
Conversation
src/sql/run/sparkdataframe.py
Outdated
raise exceptions.MissingPackageError("pysark not installed")

return SparkResultProxy(dataframe, dataframe.columns, should_cache)
try:
please integrate this with short_errors:
Line 175 in 0433444
short_errors = Bool(
by default, it should raise the exception; if short_errors is True, then just print it
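A minimal sketch of the requested behavior (how short_errors reaches this function is an assumption for illustration, not the final implementation; SparkResultProxy is the proxy class from the diff above):

from pyspark.sql.utils import AnalysisException

def handle_spark_dataframe(dataframe, should_cache=False, short_errors=False):
    # NOTE: taking short_errors as a parameter is an assumption for this sketch;
    # in jupysql it is the SqlMagic Bool trait referenced above
    try:
        return SparkResultProxy(dataframe, dataframe.columns, should_cache)
    except AnalysisException as e:
        if short_errors:
            print(e)  # show only the Spark SQL error message
        else:
            raise  # default: re-raise so the full stack trace is preserved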
I see this was marked as resolved but I don't see any references to the short_errors option in your PR, am I missing anything?
src/sql/run/sparkdataframe.py
Outdated
return SparkResultProxy(dataframe, dataframe.columns, should_cache)
try:
    return SparkResultProxy(dataframe, dataframe.columns, should_cache)
except AnalysisException as e:
this except is redundant; the except Exception as e can catch all exceptions
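In other words, since pyspark's AnalysisException subclasses Exception, a single broad handler is enough; a rough sketch (handle_exception is a hypothetical stand-in for the existing error handling, not code from this PR):

try:
    return SparkResultProxy(dataframe, dataframe.columns, should_cache)
except Exception as e:  # also catches pyspark's AnalysisException
    handle_exception(e)  # hypothetical stand-in for the existing handling code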
src/sql/util.py
Outdated
@@ -559,6 +559,7 @@ def is_non_sqlalchemy_error(error):
    # Pyspark
    "UNRESOLVED_ROUTINE",
    "PARSE_SYNTAX_ERROR",
    "AnalysisException",
After looking through the code I think adding AnalysisException here will solve the issue, since PARSE_SYNTAX_ERROR works as expected. AnalysisException covers all these error conditions.
I just need to test it somehow. Will try to package jupysql and install it in a Spark environment.
you can install it like this:
pip install git+https://github.com/b1ackout/jupysql@running-sql-using-sparkconnect-should-not-print-full-stack-trace
I tested it but no luck: the error message doesn't contain AnalysisException, only the SQL error conditions listed here, which are included in AnalysisException. So instead of including all of these in the list (which would also need to be updated regularly), I think checking whether the error is an instance of AnalysisException would be a lot cleaner.
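Roughly, the proposal is to key off the exception type instead of maintaining a list of error-condition strings; a sketch (the helper name is illustrative only; the actual change further down folds this into is_non_sqlalchemy_error):

try:
    from pyspark.sql.utils import AnalysisException
except ModuleNotFoundError:
    AnalysisException = None  # pyspark not installed

def is_spark_analysis_error(error):
    # True only when pyspark is available and the error is an AnalysisException
    return AnalysisException is not None and isinstance(error, AnalysisException)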
@@ -9,9 +9,9 @@

def handle_spark_dataframe(dataframe, should_cache=False):
    """Execute a ResultSet sqlaproxy using pysark module."""
    """Execute a ResultSet sqlaproxy using pyspark module."""
Fix typo
@@ -556,11 +562,14 @@ def is_non_sqlalchemy_error(error):
    "pyodbc.ProgrammingError",
    # Clickhouse errors
    "DB::Exception:",
    # Pyspark
Removed these as they are included in AnalysisException
seems like you're still working on this PR, please review our contribution guidelines: https://ploomber-contributing.readthedocs.io/en/latest/contributing/responding-pr-review.html specifically the point about addressing comments. please post links to the code changes so we can review them one by one. I also saw you marked some discussions as resolved, please unmark them so I can review them
closing due to inactivity
if not DataFrame and not CDataFrame:
    raise exceptions.MissingPackageError("pysark not installed")
    raise exceptions.MissingPackageError("pyspark not installed")
Fix typo
try:
    from pyspark.sql.utils import AnalysisException
except ModuleNotFoundError:
    AnalysisException = None
This is to handle the case where the pyspark module is not installed.
is_pyspark_analysis_exception = (
    isinstance(error, AnalysisException) if AnalysisException else False
)
return (
    any(msg in str(error) for msg in specific_db_errors)
    or is_pyspark_analysis_exception
)
If AnalysisException was imported, this checks whether the error is an instance of pyspark's AnalysisException and handles it accordingly.
@edublancas sorry, I was away on paternity leave, I opened a new one: #1024
Describe your changes
When short_errors is enabled, show only the Spark SQL error and not the full stack trace.
Issue number
Closes #1011
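For context, a user toggles this behavior through jupysql's existing config option; a minimal notebook sketch (assuming the standard SqlMagic config surface and the short_errors trait referenced in the review above):

%load_ext sql
# with short_errors enabled, a failing Spark SQL query prints only the
# AnalysisException message instead of the full Spark Connect stack trace
%config SqlMagic.short_errors = True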
Checklist before requesting a review
pkgmt format
📚 Documentation preview 📚: https://jupysql--1012.org.readthedocs.build/en/1012/