Description
The more I think about the importance of data contracts, the more it feels like ensuring coverage checks as part of a team's workflow is a natural evolution of this pattern.
Context
The way I see this, there are two standards a user should aim for:
A "gold standard 🥇" pattern where every dataset in your project has pandera schemas attached (all parameter inputs also have pandera/pydantic definitions too)
A "silver standard 🥈" pattern where just the free-inputs/outputs of a pipeline are properly validated and the rest is treated a closed box.
Possible Implementation
Build an AST introspection utility which uses an instantiated KedroSession object to validate state
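Roughly, the session-driven part could look like this (untested sketch: it assumes a Kedro version where datasets expose a `metadata` attribute, and treats a `pandera` key in that metadata as the marker for an attached schema, neither of which is a fixed convention):

```python
# Hypothetical coverage check driven by a live KedroSession: flag every
# dataset used by the default pipeline with no pandera schema in its
# catalog metadata.
from pathlib import Path

from kedro.framework.project import pipelines
from kedro.framework.session import KedroSession
from kedro.framework.startup import bootstrap_project

project_path = Path.cwd()
bootstrap_project(project_path)

with KedroSession.create(project_path=project_path) as session:
    catalog = session.load_context().catalog
    pipeline = pipelines["__default__"]

    # Skip parameters here; they'd be covered by pydantic checks instead.
    data_names = [
        n for n in pipeline.datasets()  # data_sets() on older Kedro
        if n != "parameters" and not n.startswith("params:")
    ]
    missing = []
    for name in data_names:
        dataset = catalog._get_dataset(name)  # private API; version-dependent
        metadata = getattr(dataset, "metadata", None) or {}
        if "pandera" not in metadata:
            missing.append(name)

    coverage = 1 - len(missing) / max(len(data_names), 1)
    print(f"Schema coverage: {coverage:.0%}; missing: {sorted(missing)}")
```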
Possible Alternatives
Look at building a Pylint plugin to do the same thing
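For reference, the Pylint route might look roughly like this: a checker that flags unannotated parameters in node modules. The module-name heuristic and the message id/symbol are placeholders, not a worked-out design:

```python
from astroid import nodes
from pylint.checkers import BaseChecker
from pylint.lint import PyLinter


class SchemaCoverageChecker(BaseChecker):
    name = "schema-coverage"
    msgs = {
        "W9901": (
            "Node function %s has unannotated parameters",
            "missing-dataframe-schema",
            "Node function parameters should carry pandera/pydantic annotations.",
        ),
    }

    def visit_functiondef(self, node: nodes.FunctionDef) -> None:
        # Crude heuristic: treat every function defined in a *nodes
        # module as a Kedro node function.
        if not node.root().name.endswith("nodes"):
            return
        if any(ann is None for ann in node.args.annotations):
            self.add_message(
                "missing-dataframe-schema", node=node, args=(node.name,)
            )


def register(linter: PyLinter) -> None:
    linter.register_checker(SchemaCoverageChecker(linter))
```

The downside is that static analysis only sees annotations, not catalog metadata, so it could never verify the catalog side of the contract.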
I like the idea, but why would we need AST introspection? Couldn't we just check if all the datasets in a pipeline have a schema attached in their metadata? Is this related to the decorator way of triggering data checks?
So I was thinking about supporting Python annotations (as well as the catalog metadata), but we don't actually have to do that with static analysis; we can just inspect the live objects at pipeline creation time.
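Something like this, untested and with a made-up helper name, walking the live Pipeline object and checking each node function's type hints for subscripted pandera `DataFrame[...]` annotations:

```python
# Illustrative check at pipeline creation time: no AST needed, because
# the node functions and their annotations are live, importable objects.
import inspect
import typing

from kedro.pipeline import Pipeline
from pandera.typing import DataFrame


def unannotated_params(pipeline: Pipeline) -> dict[str, list[str]]:
    """For each node, list the parameters (plus "return") whose hint is
    not a subscripted pandera DataFrame[SomeSchema] annotation."""
    report: dict[str, list[str]] = {}
    for node in pipeline.nodes:
        hints = typing.get_type_hints(node.func)
        names = list(inspect.signature(node.func).parameters) + ["return"]
        missing = [
            n for n in names
            if typing.get_origin(hints.get(n)) is not DataFrame
        ]
        if missing:
            report[node.name] = missing
    return report
```

A plugin could run something like this from a hook or a CLI command, fall back to catalog metadata for anything unannotated, and fail below a coverage threshold.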