Description
In the reference project kedro-agentic-workflows, we needed to create a custom dataset (SQLAlchemyEngineDataset) that returns a db_engine. This allowed us to handle writes and updates to the database within agentic workflows (e.g., creating claims, updating sessions, logging interactions).
While this approach works, it feels like a workaround. The current Kedro SQL datasets primarily focus on read-only workflows (e.g., SQLQueryDataSet) or batch inserts. They do not directly support upsert (insert + update) patterns, which are increasingly common when dealing with LLM-driven workflows, streaming data, or session management.
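To make the shape of the current workaround concrete, here is a minimal sketch. It is not the actual SQLAlchemyEngineDataset from kedro-agentic-workflows: `ConnectionDataset`, `end_session`, and the `sessions` table are hypothetical names, and the standard-library `sqlite3` module stands in for SQLAlchemy so the example runs without extra dependencies. The point is only that the dataset hands out a live database handle and the node does the write itself.

```python
import sqlite3


class ConnectionDataset:
    """Hypothetical stand-in for an engine-returning dataset (not a Kedro class)."""

    def __init__(self, database: str):
        self._database = database
        self._conn = None

    def load(self) -> sqlite3.Connection:
        # Create the connection lazily and reuse it on later loads.
        if self._conn is None:
            self._conn = sqlite3.connect(self._database)
        return self._conn

    def save(self, data) -> None:
        # The dataset only provides a handle; writes happen inside nodes.
        raise NotImplementedError("ConnectionDataset does not support save()")


def end_session(conn: sqlite3.Connection, session_id: str, ended_at: str) -> int:
    """Node-style function: performs the UPDATE directly on the handle."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE sessions SET ended_at = ? WHERE session_id = ?",
            (ended_at, session_id),
        )
    return cur.rowcount


# Usage: the node receives the connection as if it were ordinary data.
db = ConnectionDataset(":memory:")
conn = db.load()
conn.execute("CREATE TABLE sessions (session_id TEXT PRIMARY KEY, ended_at TEXT)")
conn.execute("INSERT INTO sessions (session_id) VALUES ('s1')")
updated = end_session(db.load(), "s1", "2024-01-01T00:00:00")
```

In this pattern the SQL lives in node code rather than in the catalog, which is the part that feels like a workaround.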
The goal of this ticket is to explore how Kedro can better support upserts in SQL datasets without requiring users to drop down to engine-level operations, and whether doing so makes sense from the framework's perspective.
Context
- In LLM/agent workflows, we frequently need to:
  - Create new records (e.g., new claims).
  - Update existing records (e.g., session end timestamps, claim statuses).
  - Log events incrementally rather than in bulk.
- Current datasets require manual SQLAlchemy connections or custom datasets.
- A first-class upsert-capable SQL dataset could simplify workflows and make them more idiomatic within Kedro.
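As a starting point for discussion, an upsert-capable dataset could look roughly like the sketch below. This is an assumption about one possible design, not an existing Kedro or kedro-datasets API: `SQLiteUpsertDataset`, its `key_columns` argument, and the `claims` table are hypothetical, the class does not inherit from any Kedro base class, and it uses the standard-library `sqlite3` module with SQLite's `INSERT ... ON CONFLICT ... DO UPDATE` clause (SQLite 3.24+). Other databases use different upsert syntax.

```python
import sqlite3
from typing import Any, Iterable, Mapping


class SQLiteUpsertDataset:
    """Hypothetical sketch: save() inserts new rows and updates existing ones."""

    def __init__(self, conn: sqlite3.Connection, table: str, key_columns: Iterable[str]):
        self._conn = conn
        self._table = table
        self._keys = list(key_columns)  # must match a PRIMARY KEY / UNIQUE constraint

    def save(self, rows: Iterable[Mapping[str, Any]]) -> None:
        rows = list(rows)
        if not rows:
            return
        # Assumes every row has the same columns as the first one.
        columns = list(rows[0].keys())
        updates = [c for c in columns if c not in self._keys]
        if updates:
            action = "DO UPDATE SET " + ", ".join(f"{c} = excluded.{c}" for c in updates)
        else:
            action = "DO NOTHING"
        # Table and column names are interpolated, so they must come from
        # trusted configuration; values are bound as named parameters.
        sql = (
            f"INSERT INTO {self._table} ({', '.join(columns)}) "
            f"VALUES ({', '.join(':' + c for c in columns)}) "
            f"ON CONFLICT ({', '.join(self._keys)}) {action}"
        )
        with self._conn:  # one transaction per save()
            self._conn.executemany(sql, rows)

    def load(self) -> list:
        cur = self._conn.execute(f"SELECT * FROM {self._table}")
        names = [d[0] for d in cur.description]
        return [dict(zip(names, row)) for row in cur.fetchall()]


# Usage: create a claim, then update it and add a second one in a single save().
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE claims (claim_id TEXT PRIMARY KEY, status TEXT)")
claims = SQLiteUpsertDataset(conn, "claims", key_columns=["claim_id"])
claims.save([{"claim_id": "c1", "status": "open"}])
claims.save([
    {"claim_id": "c1", "status": "closed"},
    {"claim_id": "c2", "status": "open"},
])
result = sorted(claims.load(), key=lambda r: r["claim_id"])
```

With something like this, nodes would return plain records and the catalog entry would decide which columns identify a row, so node code would not need to touch the engine.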