-
Notifications
You must be signed in to change notification settings - Fork 1k
feat(duckDB): Cast inputs (BLOB → VARCHAR) for duckDB STARTS_WITH #6240
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat(duckDB): Cast inputs (BLOB → VARCHAR) for duckDB STARTS_WITH #6240
Conversation
a99ba1f to
93b7efa
Compare
georgesittas
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should mimic the implementation for Lower and Upper. Take a look at _case_conversion_sql.
82a535d to
fd278a0
Compare
georgesittas
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one comment, looks good otherwise. Thanks!
| # Annotate types if needed for type-based casting | ||
| if not arg.type: | ||
| from sqlglot.optimizer.annotate_types import annotate_types | ||
|
|
||
| annotate_types(arg, dialect=self.dialect) | ||
|
|
||
| # Convert ByteString to String literal before generation | ||
| # ByteStrings get typed as UNKNOWN and would be wrapped in CAST(...AS BLOB) by generator | ||
| if isinstance(arg, exp.ByteString): | ||
| arg.replace(exp.Literal.string(arg.this)) | ||
| # Cast non-VARCHAR types to VARCHAR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this logic is necessary?
- We don't have to annotate types here, since transpiling this successfully from bigquery to duckdb requires that we run type inference first, and hence we should have the types available. For duckdb -> duckdb, we don't need types– the varchar/unknown branch should suffice since the type will always be unknown and we won't cast, hence preserving the sql @ roundtrip
- Doesn't look like we should deal with bytestring as a special case. Despite a bit verbose, doing a double-cast is probably the safer way to go.
When
STARTS_WITHis used withBLOB/BYTEStypes that are not literals (e.g., from CAST, table columns, function results), the transpiled DuckDB query fails because DuckDB's starts_with only acceptsVARCHAR.Before:
After: