-
Notifications
You must be signed in to change notification settings - Fork 3.3k
DRAFT - feat(ingestion/transformers): add is_container config to tags transformers to enable container tagging #14290
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
…rmers to enable container tagging
| if not tags_to_add: | ||
| continue | ||
|
|
||
| assert self.ctx.graph |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dangerous use of assert - low severity
When running Python in production in optimized mode, assert calls are not executed. This mode is enabled by setting the PYTHONOPTIMIZE command line flag. Optimized mode is usually ON in production. Any safety check done using assert will not be executed.
Remediation: Raise an exception instead of using assert.
View details in Aikido Security
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
|
|
||
| class AddDatasetTagsConfig(TransformerSemanticsConfigModel): | ||
| get_tags_to_add: Callable[[str], List[TagAssociationClass]] | ||
| is_container: bool = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know we've used this is_container flag pattern in other places, but frankly I find it really confusing that the dataset tag transformer has a container-related option.
I'd like us to spend a little bit of time thinking about if there is a better way to do this that results in a more logically consistent interface / less confusing experience.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In particular, it's important to me that we continue to reuse as much code as possible around the logic of merging ingestion-produced tags with server-fetch tags. But I don't think stuffing all of that functionality in a dataset transformer is the right approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know we've used this is_container flag pattern in other places, but frankly I find it really confusing that the dataset tag transformer has a container-related option.
I'd like us to spend a little bit of time thinking about if there is a better way to do this that results in a more logically consistent interface / less confusing experience.
I do agree with the comment and sentiment. My general feel is that the OOTB transformers, in being limited to datasets, are restrictive. There is functionality within the existing transformers that would make sense to be applied to other entity types (e.g. dashboards, containers etc.). Because of this it feels like the is_container functionality has generally been added to get around these imposed restrictions on the OOTB transformers. I do feel that this change does at least offer consistency with other transformers, hence raising it.
It does feel like an area for us to revisit - what the future state of transformers should be, and how they should be used, and what entity types should be allowed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In particular, it's important to me that we continue to reuse as much code as possible around the logic of merging ingestion-produced tags with server-fetch tags. But I don't think stuffing all of that functionality in a dataset transformer is the right approach.
What would you propose as next steps? Is there existing code logic that this should leverage or would you prefer for this PR to start on the creation of this common code?
No description provided.