Exclude 202 status from failure handling to support MSQ queries #325

Open
wants to merge 1 commit into
base: master

vrrs commented Oct 23, 2024

What

When pydruid submits a query to the new Druid MSQ engine, the client fails because it treats any non-200 status code, including 202, as a failure. The MSQ engine, whose endpoint is druid/v2/sql/task, is an async API: instead of rows it returns the status of the query, for example {'taskId': 'query-cd91f1a2-0d15-4e2b-a858-3e2a7cc93374', 'state': 'RUNNING'}. With this small change, which excludes 202 from the failure handling, pydruid works as expected and returns one row.
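The idea can be sketched roughly as follows. This is a minimal illustration, not the actual diff or pydruid's real code: the helper name `raise_for_druid_status`, the `SUCCESS_STATUSES` set, and the `errorClass` payload key are assumptions made for the example.

```python
# Hypothetical helper for illustration only; not pydruid's actual code.
SUCCESS_STATUSES = {200, 202}  # 202: task accepted by the async MSQ endpoint


def raise_for_druid_status(status_code, payload):
    """Return the payload for 200/202 and raise for any other status."""
    if status_code in SUCCESS_STATUSES:
        return payload
    # .get() keeps the message readable even when the payload lacks the
    # usual error keys (the "errorClass" key name is an assumption).
    msg = "{error} ({category}): {message}".format(
        error=payload.get("error", "Unknown error"),
        category=payload.get("errorClass", "Unknown"),
        message=payload.get("errorMessage", "no message"),
    )
    raise RuntimeError(msg)


# A 202 response carrying a task status is passed through, not raised.
accepted = {
    "taskId": "query-cd91f1a2-0d15-4e2b-a858-3e2a7cc93374",
    "state": "RUNNING",
}
print(raise_for_druid_status(202, accepted)["state"])
```

Using a set of accepted codes rather than a single `!= 200` comparison keeps the change small and makes the intent explicit.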

Related to this issue

Error

[2024-10-23, 04:43:06 UTC] {taskinstance.py:2905} ERROR - Task failed with exception
Traceback (most recent call last):
  File "/opt/.pyenv/versions/3.9.8/envs/airflow/lib/python3.9/site-packages/airflow/providers/common/sql/hooks/sql.py", line 292, in get_first
    return self.run(sql=sql, parameters=parameters, handler=fetch_one_handler)
  File "/opt/.pyenv/versions/3.9.8/envs/airflow/lib/python3.9/site-packages/airflow/providers/common/sql/hooks/sql.py", line 418, in run
    self._run_command(cur, sql_statement, parameters)
  File "/opt/.pyenv/versions/3.9.8/envs/airflow/lib/python3.9/site-packages/airflow/providers/common/sql/hooks/sql.py", line 475, in _run_command
    cur.execute(sql_statement)
  File "/opt/.pyenv/versions/3.9.8/envs/airflow/lib/python3.9/site-packages/pydruid/db/api.py", line 62, in g
    return f(self, *args, **kwargs)
  File "/opt/.pyenv/versions/3.9.8/envs/airflow/lib/python3.9/site-packages/pydruid/db/api.py", line 256, in execute
    first_row = next(results)
  File "/opt/.pyenv/versions/3.9.8/envs/airflow/lib/python3.9/site-packages/pydruid/db/api.py", line 364, in _stream_query
    msg = "{error} ({category}): {errorMessage}".format(
KeyError: 'error'
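The KeyError can be reproduced in isolation. The sketch below assumes that the error message is built by formatting the response payload's keys into the template shown in the traceback. The template string is taken from the traceback; the way the payload is passed to `format` is an assumption.

```python
# Template string as it appears in the traceback above.
template = "{error} ({category}): {errorMessage}"

# A 202 response from druid/v2/sql/task carries a task status, not an error.
payload = {
    "taskId": "query-cd91f1a2-0d15-4e2b-a858-3e2a7cc93374",
    "state": "RUNNING",
}

try:
    # Assumption: the payload's keys are fed into the template.
    template.format(**payload)
except KeyError as exc:
    missing_key = exc.args[0]
    print("missing key:", missing_key)
```

Because the 202 payload has no `error` field, the failure surfaces as a KeyError rather than as a readable Druid error message.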

Test

I ran the following snippet:

from pydruid.db import connect

# Connection settings for the async MSQ task endpoint.
conn_params = {
    "endpoint": "druid/v2/sql/task",
    "scheme": "https",
}
conn = connect(
    host="mybroker.net",
    port=2091,
    path=conn_params["endpoint"],
    scheme=conn_params["scheme"],
)
curs = conn.cursor()
sql = """
SELECT 1
"""
curs.execute(sql)
for row in curs:
    print(row)

vrrs commented Oct 25, 2024

@gianm, I would really appreciate a review, or for you to tag someone who might be able to review.
