Replies: 9 comments
-
Thanks very much Alexander. I see and understand the fixes you are proposing, and believe they provide a meaningful improvement to the Superset community in that our Teradata customers will now have the option to use Superset with Teradata. |
Beta Was this translation helpful? Give feedback.
-
Great to see the SQLAlchemy driver updated! I think we might not need a SIP for that, it's a clear win and not controversial, though it might require a new major release. Can the new driver co-exist with the old one?
This is an easy fix: class TeradataEngineSpec(BaseEngineSpec):
"""Dialect for Teradata DB."""
limit_method = LimitMethod.FETCH_MANY
... This error happens because in SQL Lab we try to limit the query to prevent fetching too much data. There are 3 strategies for this:
SELECT * FROM (
$SQL_QUERY
) LIMIT 1000; This method is great because it doesn't require parsing the SQL query, and is mostly foolproof. Not all databases support this, though.
|
Beta Was this translation helpful? Give feedback.
-
Also, fell free to create a draft PR with the proposed changes, it's easier to discuss it on the PR. |
Beta Was this translation helpful? Give feedback.
-
I'm not sure if we can list it as an optional dependency (like we do for other databases) if the license is not compatible with ASF. |
Beta Was this translation helpful? Give feedback.
-
Thanks very much Alexander. I see and understand the fixes you are proposing, and believe they provide a meaningful improvement to the Superset community in that our Teradata customers will now have the option to use Superset with Teradata. |
Beta Was this translation helpful? Give feedback.
-
Since @classmethod
def apply_limit_to_sql(cls, sql: str, limit: int, database: "Database") -> str:
"""
Alters the SQL statement to apply a LIMIT clause
:param sql: SQL query
:param limit: Maximum number of rows to be returned by the query
:param database: Database instance
:return: SQL query with limit clause
"""
# TODO: Fix circular import caused by importing Database
if cls.limit_method == LimitMethod.WRAP_SQL:
sql = sql.strip("\t\n ;")
qry = (
select("*")
.select_from(TextAsFrom(text(sql), ["*"]).alias("inner_qry"))
.limit(limit)
)
return database.compile_sqla_query(qry)
if cls.limit_method == LimitMethod.FORCE_LIMIT:
parsed_query = sql_parse.ParsedQuery(sql)
sql = parsed_query.set_or_update_query_limit(limit)
return sql You could implement a custom method in This approach would be better than the one you have modifying |
Beta Was this translation helpful? Give feedback.
-
@betodealmeida Thanks for your comments. I created a pull request as suggested. We did you look briefly at the custom class "TD_TOP" we go unhung up on the apply_limit and update_limt with TOP in the raw sql text. Let us know if that is the approach that can get approved. here is the link to my pull request #13519 |
Beta Was this translation helpful? Give feedback.
-
@mccushjack awesome, thanks! Let me take a look at the PR. |
Beta Was this translation helpful? Give feedback.
-
Edited the original post to provide more clarity regarding the teradatasql (and teradatasqlalchemy) license files. I also explicitly attached these license files in the "Attachments" section at the bottom of the post. |
Beta Was this translation helpful? Give feedback.
-
Motivation
In attempting to use Apache Superset with Teradata, we observed the following:
Figure 1
The current SIP is a proposal to update Superset connectivity to Teradata Vantage systems, and to correct the way the 'LIMIT' keyword in the Superset back-end SQL is interpreted by Teradata SQL. It is likely that the latter fix might also provide a fix for issue #11405, too.
Proposed Changes
For Topic 1, the following is proposed:
We recommend updating the current (as of March 5, 2021) Superset Teradata connection webpage at https://superset.apache.org/docs/databases/teradata so that
This proposal simplifies Teradata connections, by rendering as obsolete the previously necessary steps to download the Teradata ODBC drivers and to specify corresponding environment variables. The current proposal only requires installation of the teradatasqlalchemy package, as teradatasql is a dependency of teradatasqlalchemy and is satisfied upon installation of teradatasqlalchemy. Alternatively, if users have already installed the teradataml package (that is, the Teradata client package for Python) on their client computer, then no further action is needed.
In all, the proposal is summarized into the single action of replacing the existing text on the page https://superset.apache.org/docs/databases/teradata with the following text:
A sample connection string to a Teradata Vantage system by using teradatasqlalchemy with the teradatasql driver is shown in the Figure 2 below.
Figure 2
For Topic 2, the following is proposed:
My colleague David Chan identified and implemented what needs to be done in the existing Superset code, so that the 'LIMIT' keyword is correctly identified and translated to Teradata SQL, as follows:
The proposed solution is non-destructive, in that it introduces new code that is executed only upon identifying a connection as a Teradata connection. The corresponding fix is provided in the modified versions of the files "db_engine_specs/base.py", "sql_lab.py", and "sql_parse.py", which are included in the file "SIP_Teradata.zip" at the bottom of the present SIP. The listing of differences between the original files and the proposed fix files is detailed in the text file "codeDifferences.txt", which is also included in the file "SIP_Teradata.zip" at the bottom of the present SIP.
By using the fix files in place of the original ones, the issue identified above in Figure 1 is resolved as illustrated in the following Figure 3.
Figure 3
New or Changed Public Interfaces
The present proposal requires no new or changed public interfaces.
New dependencies
Only for Teradata connections, the present proposal requires installation of
packages to connect to a Teradata Vantage system Advanced SQL Engine Database.
These packages are actively maintained and their licenses are Teradata proprietary. The teradatasql and teradatasqlalchemy license files are attached below at the bottom of this posting. Upon installing these packages on a computer, these license files can be also found in the corresponding package directory under the main Python installation directory.
Migration Plan and Compatibility
No migration is necessary. The proposed code changes are an extension and non-destructive to the existing code, thus not affecting existing compatibility.
Rejected Alternatives
No alternative approaches have been considered or rejected.
Attachments
SIP_Teradata.zip
LICENSE_teradatasql.txt
LICENSE_teradatasqlalchemy.txt
Beta Was this translation helpful? Give feedback.
All reactions