Replies: 1 comment 2 replies
-
With the issues I've been hitting with schema(/column) name conforming, I really wonder if we should be messing with column names much. I understand potentially changing all columns to lowercase by default but anything beyond that seems like a nasty can of worms. Allow folks to override the column names in the catalog seems more clean. We could also put some of the more "crazy" column conforming stuff behind a flag like |
Beta Was this translation helpful? Give feedback.
2 replies
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
-
This has come up in development of
target-snowflake
but applies more generally to warehouse and some database targets.The key issue is that stream property identifiers are much more permissive than database object identifiers (database, schema, table and column names). e.g.
is perfectly valid from a Singer perspective. However:
select
is a reserved word in many databases, requiring escaping/quoting.Select != select
in JSON terms, but Snowflake and Redshift are case insensitive meaning both will map to equal values if unquoted. Warehouses (at least Snowflake and Redshift) treats quoted values particularly poorly - objects created with double-quotes must always be referred to with those quotes thereafter, which leads to a terrible user experience down stream (in dbt and BI tools).For these and other similar cases, commonly used targets (like
pipelinewise-target-snowflake
) have methods for conforming stream property identifiers according to the target systems constraints. Currently, in the SDK, no transformation is done by default; property identifiers are passed directly to SQLAlchemy, which tries to resolve any issues using quoting (which again is suboptimal in common warehouses). The only way to interject transformations is to overload all the methods that access stream properties (lots of them).Therefore, to better support SQL target types, it would be useful to allow users to provide a
conform_identifier
method (or similar) as part of their target implementation that the SDK uses when creating/updating/inserting objects into the database.References:
Redshift constraints
Snowflake constraints
Beta Was this translation helpful? Give feedback.
All reactions