fix(mcp): add dataset_id to update_chart to support rebinding chart datasource#40853
fix(mcp): add dataset_id to update_chart to support rebinding chart datasource#40853aminghadersohi wants to merge 2 commits into
Conversation
…atasource UpdateChartRequest lacked a dataset_id field, so Pydantic silently dropped the parameter whenever the LLM passed it. _build_update_payload always fell back to chart.datasource_id and never included datasource_id/datasource_type in the UpdateChartCommand payload, causing the tool to return success: true while the chart stayed bound to its original dataset. Added dataset_id: int | None to UpdateChartRequest and plumbed it through _build_update_payload (includes datasource_id/datasource_type in payload), _build_preview_form_data (updates datasource field), _validate_update_against_dataset (validates against the new dataset when provided), and _create_preview_url (uses new datasource for form_data caching). Also allows dataset-only updates without requiring config or chart_name.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #40853 +/- ##
==========================================
- Coverage 64.14% 63.52% -0.63%
==========================================
Files 2652 2653 +1
Lines 143488 143766 +278
Branches 33110 33161 +51
==========================================
- Hits 92042 91324 -718
- Misses 49837 50824 +987
- Partials 1609 1618 +9
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
|
From my agent reviewer: Overall: solid fix, security model intact. One real bug to fix before merge. 🔴 High — Rename silently dropped 🟡 Medium — Dataset-only rebind skips validation 🟢 Low — datasource_type hardcoded to "table" 🟢 Low — DRY 🟢 Low — Test gaps |
There was a problem hiding this comment.
Code Review Agent Run #77fe3f
Actionable Suggestions - 2
-
superset/mcp_service/chart/tool/update_chart.py - 2
- Missing validation for dataset-only rebind · Line 139-144
- Missing validation in preview path · Line 505-514
Review Details
-
Files reviewed - 3 · Commit Range:
5f83dec..5f83dec- superset/mcp_service/chart/schemas.py
- superset/mcp_service/chart/tool/update_chart.py
- tests/unit_tests/mcp_service/chart/tool/test_update_chart.py
-
Files skipped - 0
-
Tools
- Whispers (Secret Scanner) - ✔︎ Successful
- Detect-secrets (Secret Scanner) - ✔︎ Successful
- MyPy (Static Code Analysis) - ✔︎ Successful
- Astral Ruff (Static Code Analysis) - ✔︎ Successful
Bito Usage Guide
Commands
Type the following command in the pull request comment and save the comment.
-
/review- Manually triggers a full AI review. -
/pause- Pauses automatic reviews on this pull request. -
/resume- Resumes automatic reviews. -
/resolve- Marks all Bito-posted review comments as resolved. -
/abort- Cancels all in-progress reviews.
Refer to the documentation for additional commands.
Configuration
This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.
Documentation & Help
| # Dataset-only update: rebind chart to a different dataset without changing viz | ||
| if request.dataset_id is not None: | ||
| return { | ||
| "datasource_id": request.dataset_id, | ||
| "datasource_type": "table", | ||
| } |
There was a problem hiding this comment.
When dataset_id is provided alone (without config), the code skips validation entirely. The new dataset might not exist, might be inaccessible, or might have incompatible schema. Add a validation call using _validate_update_against_dataset with the new dataset_id before returning the payload at line 140.
Code suggestion
Check the AI-generated fix before applying
--- superset/mcp_service/chart/tool/update_chart.py (lines 139-144) ---
139: # Dataset-only update: rebind chart to a different dataset without changing viz
140: if request.dataset_id is not None:
141: + # Validate the new dataset exists and is accessible before rebinding
142: + validation_error = _validate_update_against_dataset(
143: + None, # parsed_config
144: + {}, # form_data (empty for dataset-only update)
145: + chart,
146: + dataset_id=request.dataset_id,
147: + )
148: + if validation_error is not None:
149: + return validation_error
150: return {
151: "datasource_id": request.dataset_id,
152: "datasource_type": "table",
153: }
Code Review Run #77fe3f
Should Bito avoid suggestions like this for future reviews? (Manage Rules)
- Yes, avoid them
There was a problem hiding this comment.
Fixed in faef804. Added validation for the dataset-only rebind path in both persist and preview flows. The target dataset is now checked for existence via _validate_update_against_dataset before writing or caching. The compile check is intentionally skipped (run_compile_check=False) since there is no new chart config to execute — the existing chart form_data is preserved as-is, only the datasource binding changes.
There was a problem hiding this comment.
The implementation correctly addresses the suggestion by adding the required validation call using _validate_update_against_dataset before the dataset rebind occurs. By passing None for the config and an empty dictionary for the form data, the code ensures that the dataset existence and accessibility are verified without triggering unnecessary compilation checks, which is appropriate for a dataset-only update.
superset/mcp_service/chart/tool/update_chart.py
if request.dataset_id is not None:
# Validate the new dataset exists and is accessible before rebinding
validation_error = _validate_update_against_dataset(
None, # parsed_config
{}, # form_data (empty for dataset-only update)
chart,
dataset_id=request.dataset_id,
)
if validation_error is not None:
return validation_error
| if parsed_config is not None: | ||
| with event_logger.log_context(action="mcp.update_chart.validation"): | ||
| validation_error = _validate_update_against_dataset( | ||
| parsed_config, preview_or_error, chart | ||
| parsed_config, | ||
| preview_or_error, | ||
| chart, | ||
| dataset_id=request.dataset_id, | ||
| ) | ||
| if validation_error is not None: | ||
| return validation_error |
There was a problem hiding this comment.
The preview path (generate_preview=True) has the same validation gap as the persist path. When only dataset_id is provided, validation is skipped. Add a check: if request.dataset_id is not None and parsed_config is None: validation_error = _validate_update_against_dataset(None, {}, chart, dataset_id=request.dataset_id)
Code suggestion
Check the AI-generated fix before applying
--- superset/mcp_service/chart/tool/update_chart.py (lines 505-514) ---
505: # Validate before caching the form_data — same rationale as above.
506: + # Also validate dataset-only updates to catch invalid/inaccessible datasets
507: + if parsed_config is not None or request.dataset_id is not None:
508: if parsed_config is not None:
509: with event_logger.log_context(action="mcp.update_chart.validation"):
510: validation_error = _validate_update_against_dataset(
511: parsed_config,
512: preview_or_error,
513: chart,
514: dataset_id=request.dataset_id,
515: )
516: + else:
517: + # Dataset-only update: validate the new dataset exists
518: + with event_logger.log_context(action="mcp.update_chart.validation"):
519: + validation_error = _validate_update_against_dataset(
520: + None,
521: preview_or_error,
522: chart,
523: dataset_id=request.dataset_id,
524: )
Code Review Run #77fe3f
Should Bito avoid suggestions like this for future reviews? (Manage Rules)
- Yes, avoid them
There was a problem hiding this comment.
Fixed in faef804. Same dataset existence validation is now applied in the preview path when dataset_id is provided without config. The compile check is skipped (run_compile_check=False) for the same reason — no chart config to execute against the new dataset.
There was a problem hiding this comment.
The suggestion to add validation in the preview path for dataset-only updates is appropriate. It ensures that dataset existence and accessibility are verified even when no chart configuration is provided, maintaining consistency with the persist path.
superset/mcp_service/chart/tool/update_chart.py
507: + if parsed_config is not None or request.dataset_id is not None:
508: if parsed_config is not None:
509: with event_logger.log_context(action="mcp.update_chart.validation"):
510: validation_error = _validate_update_against_dataset(
511: parsed_config,
512: preview_or_error,
513: chart,
514: dataset_id=request.dataset_id,
515: )
516: + else:
517: + # Dataset-only update: validate the new dataset exists
518: + with event_logger.log_context(action="mcp.update_chart.validation"):
519: + validation_error = _validate_update_against_dataset(
520: + None,
521: preview_or_error,
522: chart,
523: dataset_id=request.dataset_id,
524: )
When dataset_id is provided without config, the target dataset is now checked for existence before persisting or caching. The compile check is intentionally skipped (run_compile_check=False) since there is no new chart config to execute against the new dataset. Also fixes the DatasetNotAccessible error message to report the correct dataset ID when rebinding to a new dataset.
| # Dataset-only update: rebind chart to a different dataset without changing viz | ||
| if request.dataset_id is not None: | ||
| return { | ||
| "datasource_id": request.dataset_id, | ||
| "datasource_type": "table", | ||
| } |
There was a problem hiding this comment.
Suggestion: In the non-config path, when both dataset_id and chart_name are provided, the function returns early with only datasource fields and silently drops the rename. This creates inconsistent behavior (preview can show the renamed title while save-without-preview won't persist it). Include slice_name in the dataset-only payload when chart_name is provided instead of discarding it. [incomplete implementation]
Severity Level: Major ⚠️
- ⚠️ Save path silently ignores requested chart title rename.
- ⚠️ Preview and saved chart titles become inconsistent for users.
- ⚠️ Dataset rebind plus rename workflow behaves unpredictably.Steps of Reproduction ✅
1. Call the MCP `update_chart` tool in
`superset/mcp_service/chart/tool/update_chart.py:339-400` with `generate_preview=True`,
omitting `config` so `parsed_config` is `None`, but providing both `dataset_id` (to
rebind) and `chart_name` (to rename), matching the documented dataset+name update use case
in the PR description.
2. In this preview path, `_build_preview_form_data` at
`superset/mcp_service/chart/tool/update_chart.py:152-200` loads `existing_form_data`,
computes `effective_dataset_id`, and because `parsed_config is None` but
`request.dataset_id` is not `None`, it skips `_missing_config_or_name_error` at lines
187-188, copies `existing_form_data` into `merged`, then sets `merged["slice_name"] =
request.chart_name` at lines 191-192 and `merged["datasource"] =
f"{effective_dataset_id}__table"` at lines 196-198, so the preview URL reflects both the
new dataset and the new title.
3. Next, call `update_chart` again with the same `identifier`, `dataset_id`, and
`chart_name` but with `generate_preview=False` (the "persist immediately" mode described
in the docstring around lines 374-381 of `update_chart.py`), which routes execution into
the non-preview save path at lines 72-119.
4. In this save path, `_build_update_payload` at
`superset/mcp_service/chart/tool/update_chart.py:98-149` sees `parsed_config is None` and
skips the config branch, then hits the "Dataset-only update" block at lines 139-144: `if
request.dataset_id is not None: return {"datasource_id": request.dataset_id,
"datasource_type": "table"}`. This early return ignores `request.chart_name`, so
`UpdateChartCommand` (invoked at lines 112-114) updates only
`datasource_id`/`datasource_type` and leaves `slice_name` unchanged, causing the rename to
be silently dropped even though the earlier preview showed the new title.Fix in Cursor | Fix in VSCode Claude
(Use Cmd/Ctrl + Click for best experience)
Prompt for AI Agent 🤖
This is a comment left during a code review.
**Path:** superset/mcp_service/chart/tool/update_chart.py
**Line:** 139:144
**Comment:**
*Incomplete Implementation: In the non-config path, when both `dataset_id` and `chart_name` are provided, the function returns early with only datasource fields and silently drops the rename. This creates inconsistent behavior (preview can show the renamed title while save-without-preview won't persist it). Include `slice_name` in the dataset-only payload when `chart_name` is provided instead of discarding it.
Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix| if dataset_id is not None: | ||
| dataset = DatasetDAO.find_by_id(dataset_id) |
There was a problem hiding this comment.
Suggestion: The new dataset lookup only checks existence and skips dataset-level authorization, so a user can pass a dataset_id they cannot access and still run validation against it. This can leak whether the dataset exists and potentially schema-derived validation details before the write path permission checks run. Add an explicit access check (same pattern used in other chart MCP tools) immediately after resolving dataset_id and treat unauthorized datasets as inaccessible. [security]
Severity Level: Critical 🚨
- ❌ update_chart tool can probe unauthorized dataset existence.
- ⚠️ Validation errors may leak schema or compile details.
- ⚠️ Live compile runs against datasets before access checks.Steps of Reproduction ✅
1. Invoke the MCP `update_chart` tool defined in
`superset/mcp_service/chart/tool/update_chart.py:339-400` with a request containing an
accessible `identifier` (existing chart) plus a `dataset_id` pointing to a different
dataset the caller cannot access, and a valid `config` payload so `parsed_config` is
non-null.
2. Inside `update_chart`, the chart is loaded and the existing datasource is checked via
`check_chart_data_access(chart)` at
`superset/mcp_service/chart/tool/update_chart.py:39-60`, which validates only the chart's
current dataset, not the requested `dataset_id`, so the request continues even if the
target dataset is unauthorized.
3. In the non-preview path (`generate_preview=False`), `update_chart` builds
`payload_or_error` and then calls `_validate_update_against_dataset(parsed_config,
new_form_data, chart, dataset_id=request.dataset_id)` at
`superset/mcp_service/chart/tool/update_chart.py:87-94`, passing the untrusted
`dataset_id` straight into validation.
4. `_validate_update_against_dataset` in
`superset/mcp_service/chart/tool/update_chart.py:203-249` executes `dataset =
DatasetDAO.find_by_id(dataset_id)` at lines 221-222 with no permission check, then calls
`validate_and_compile(parsed_config, form_data, dataset,
run_compile_check=run_compile_check)` at lines 247-249 and returns either success or
detailed validation errors in a `GenerateChartResponse`. Because the write path's access
control (in `superset/commands/chart/update.py:144-151` using `get_datasource_by_id` and
`security_manager.raise_for_access`) runs later, callers can probe which `dataset_id`
values exist and observe schema-derived validation behavior on datasets they are not
authorized to access.Fix in Cursor | Fix in VSCode Claude
(Use Cmd/Ctrl + Click for best experience)
Prompt for AI Agent 🤖
This is a comment left during a code review.
**Path:** superset/mcp_service/chart/tool/update_chart.py
**Line:** 221:222
**Comment:**
*Security: The new dataset lookup only checks existence and skips dataset-level authorization, so a user can pass a `dataset_id` they cannot access and still run validation against it. This can leak whether the dataset exists and potentially schema-derived validation details before the write path permission checks run. Add an explicit access check (same pattern used in other chart MCP tools) immediately after resolving `dataset_id` and treat unauthorized datasets as inaccessible.
Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
✅ Deploy Preview for superset-docs-preview ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
Code Review Agent Run #b54a9dActionable Suggestions - 0Additional Suggestions - 2
Filtered by Review RulesBito filtered these suggestions based on rules created automatically for your feedback. Manage rules.
Review Details
Bito Usage GuideCommands Type the following command in the pull request comment and save the comment.
Refer to the documentation for additional commands. Configuration This repository uses Documentation & Help |
Summary
UpdateChartRequestlacked adataset_idfield, so Pydantic silently dropped the parameter whenever the LLM passed it_build_update_payloadalways fell back tochart.datasource_idand never includeddatasource_id/datasource_typein theUpdateChartCommandpayload — the tool returnedsuccess: truewhile the chart stayed bound to its original datasetdataset_id: int | NonetoUpdateChartRequestand plumbed it through_build_update_payload,_build_preview_form_data,_validate_update_against_dataset, and_create_preview_urlTest plan
TestBuildUpdatePayloadDatasetId— 4 unit tests covering dataset-only, dataset+name, dataset+config, and config-without-dataset payloadsTestBuildPreviewFormDataDatasetId— 3 unit tests verifying thedatasourcefield is set correctly in preview form_dataTestUpdateChartDatasetIdIntegration— async integration test verifyingUpdateChartCommandreceivesdatasource_idanddatasource_typein payload