Feat: increase default type precision in redshift/mssql view workaround #2791

Open
wants to merge 5 commits into
base: main

@treysp (Contributor) commented Jun 18, 2024

Redshift and MSSQL use the VarcharSizeWorkaroundMixin to infer the size of a varchar column when it cannot be resolved from model code.

If the inferred length is the engine default (256 in Redshift, 1 in MSSQL) and the user did not intend that, data may be truncated. This PR therefore increases the length to "max" whenever a column with the default length is detected.

The most likely cause of an unintentional default length is transpilation from an engine where the bare type's default is unlimited. For example, in Postgres a VARCHAR without a length parameter has unlimited length, so transpiling it to a Redshift VARCHAR with the default length of 256 unintentionally truncates the field.
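
To make the failure mode concrete, here is a minimal stdlib-only sketch. It does not use SQLMesh or SQLGlot: `DEFAULT_VARCHAR_LENGTH` and `stored_value` are hypothetical names, the lengths are the ones quoted above, and silent truncation is assumed purely for illustration (a real engine may count bytes rather than characters, or raise an error instead).

```python
from typing import Optional

# Default lengths quoted in the PR description for a bare VARCHAR.
# Hypothetical name, used only in this sketch.
DEFAULT_VARCHAR_LENGTH = {"redshift": 256, "mssql": 1}


def stored_value(value: str, declared_length: Optional[int], dialect: str) -> str:
    """Model what survives a cast to VARCHAR in the target dialect.

    declared_length=None stands for a bare VARCHAR, which falls back to the
    dialect default instead of being unlimited as in Postgres.
    """
    if declared_length is None:
        limit = DEFAULT_VARCHAR_LENGTH[dialect]
    else:
        limit = declared_length
    return value[:limit]


text = "x" * 300
print(len(stored_value(text, None, "redshift")))  # 256
print(len(stored_value(text, None, "mssql")))     # 1
```

A 300-character value written under a bare Postgres VARCHAR would keep all 300 characters, which is why the transpiled default is surprising.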

@treysp requested a review from izeigerman June 18, 2024 22:29
@treysp changed the title Feat: increase default type precision in redshift/mssql view workaround WIP Feat: increase default type precision in redshift/mssql view workaround Jun 18, 2024
@treysp force-pushed the trey/schema-diff-precision branch 2 times, most recently from 8138a6a to 71d76af Compare June 20, 2024 19:36
Base automatically changed from trey/schema-diff-precision to main June 20, 2024 20:21
@treysp force-pushed the trey/varchar-workaround-max branch from 9fcc19f to 074c1a9 Compare June 20, 2024 20:44
@treysp marked this pull request as ready for review June 20, 2024 20:45
@treysp changed the title WIP Feat: increase default type precision in redshift/mssql view workaround Feat: increase default type precision in redshift/mssql view workaround Jun 20, 2024
@treysp force-pushed the trey/varchar-workaround-max branch from 45a20f2 to 658ed8b Compare June 21, 2024 14:12
    ) -> t.Dict[str, exp.DataType]:
        # get default lengths for types that support "max" length
        types_with_max_default_param = {
            k: [self.SCHEMA_DIFFER.parameterized_type_defaults[k][0][0]]
Member

I'm not sure we care about all types. I think we only care about Varchar, don't we?

Member

Wouldn't this simplify the implementation?

Member

Or do we think the same issue applies to types like VARBINARY?

Contributor Author

I figured it could happen for any of these types due to transpilation, but VARCHAR is much more widely used than NVARCHAR/CHAR/VARBINARY so in practice it's probably the only one that matters. I'll update.

@treysp treysp force-pushed the trey/varchar-workaround-max branch from 658ed8b to b719b06 Compare June 24, 2024 14:52
parameter = self.SCHEMA_DIFFER.get_type_parameters(col_type)
type_default = types_with_max_default_param[col_type.this]
if parameter == type_default:
    col_type.set("expressions", [exp.DataTypeParam(this=exp.Var(this="max"))])
Contributor

Suggested change
- col_type.set("expressions", [exp.DataTypeParam(this=exp.Var(this="max"))])
+ col_type.set("expressions", [exp.DataTypeParam(this=exp.var("max"))])
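
The comparison in the hunk above (replace the parameter with "max" when it equals the type's default) can be sketched without SQLGlot as follows. This is an illustration under stated assumptions, not the PR's implementation: `ColumnType`, `VARCHAR_DEFAULT`, and `widen_default_varchars` are hypothetical names, column types are modeled as plain tuples, and only VARCHAR is handled, as agreed in the review thread.

```python
from typing import Dict, Optional, Tuple

# (type name, length parameter); None means no explicit parameter.
# Hypothetical stand-in for a parsed column type.
ColumnType = Tuple[str, Optional[str]]

# Default VARCHAR lengths quoted in the PR description.
VARCHAR_DEFAULT = {"redshift": "256", "mssql": "1"}


def widen_default_varchars(
    columns: Dict[str, ColumnType], dialect: str
) -> Dict[str, ColumnType]:
    """Return a copy of columns where default-length VARCHARs become "max"."""
    default = VARCHAR_DEFAULT[dialect]
    widened: Dict[str, ColumnType] = {}
    for name, (type_name, param) in columns.items():
        if type_name.upper() == "VARCHAR" and param == default:
            # The length matches the engine default, so assume it was not
            # chosen deliberately and widen it.
            widened[name] = (type_name, "max")
        else:
            widened[name] = (type_name, param)
    return widened


columns = {
    "id": ("INT", None),
    "note": ("VARCHAR", "256"),
    "code": ("VARCHAR", "10"),
}
print(widen_default_varchars(columns, "redshift"))
```

One consequence of comparing against the default: a column explicitly declared with the default length (for example VARCHAR(256) on Redshift) is indistinguishable from an inferred one in this sketch and is widened as well.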
