Skip to content

fix(mcp): add dataset_id to update_chart to support rebinding chart datasource#40853

Open
aminghadersohi wants to merge 2 commits into
apache:masterfrom
aminghadersohi:aminghadersohi/fix-update-chart-dataset-id
Open

fix(mcp): add dataset_id to update_chart to support rebinding chart datasource#40853
aminghadersohi wants to merge 2 commits into
apache:masterfrom
aminghadersohi:aminghadersohi/fix-update-chart-dataset-id

Conversation

@aminghadersohi

Copy link
Copy Markdown
Contributor

Summary

  • UpdateChartRequest lacked a dataset_id field, so Pydantic silently dropped the parameter whenever the LLM passed it
  • _build_update_payload always fell back to chart.datasource_id and never included datasource_id/datasource_type in the UpdateChartCommand payload — the tool returned success: true while the chart stayed bound to its original dataset
  • Added dataset_id: int | None to UpdateChartRequest and plumbed it through _build_update_payload, _build_preview_form_data, _validate_update_against_dataset, and _create_preview_url
  • Also enables dataset-only updates (rebind without config or name change)

Test plan

  • TestBuildUpdatePayloadDatasetId — 4 unit tests covering dataset-only, dataset+name, dataset+config, and config-without-dataset payloads
  • TestBuildPreviewFormDataDatasetId — 3 unit tests verifying the datasource field is set correctly in preview form_data
  • TestUpdateChartDatasetIdIntegration — async integration test verifying UpdateChartCommand receives datasource_id and datasource_type in payload

…atasource

UpdateChartRequest lacked a dataset_id field, so Pydantic silently dropped
the parameter whenever the LLM passed it. _build_update_payload always fell
back to chart.datasource_id and never included datasource_id/datasource_type
in the UpdateChartCommand payload, causing the tool to return success: true
while the chart stayed bound to its original dataset.

Added dataset_id: int | None to UpdateChartRequest and plumbed it through
_build_update_payload (includes datasource_id/datasource_type in payload),
_build_preview_form_data (updates datasource field), _validate_update_against_dataset
(validates against the new dataset when provided), and _create_preview_url
(uses new datasource for form_data caching). Also allows dataset-only updates
without requiring config or chart_name.
@codecov

codecov Bot commented Jun 8, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 3.03030% with 32 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.52%. Comparing base (7c7ab88) to head (faef804).
⚠️ Report is 70 commits behind head on master.

Files with missing lines Patch % Lines
superset/mcp_service/chart/tool/update_chart.py 0.00% 32 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #40853      +/-   ##
==========================================
- Coverage   64.14%   63.52%   -0.63%     
==========================================
  Files        2652     2653       +1     
  Lines      143488   143766     +278     
  Branches    33110    33161      +51     
==========================================
- Hits        92042    91324     -718     
- Misses      49837    50824     +987     
- Partials     1609     1618       +9     
Flag Coverage Δ
hive 39.46% <3.03%> (-0.06%) ⬇️
mysql 58.20% <3.03%> (-0.08%) ⬇️
postgres 58.26% <3.03%> (-0.08%) ⬇️
presto 41.05% <3.03%> (-0.07%) ⬇️
python 58.49% <3.03%> (-1.33%) ⬇️
sqlite 57.89% <3.03%> (-0.08%) ⬇️
unit ?

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@aminghadersohi aminghadersohi marked this pull request as ready for review June 9, 2026 16:10
@dosubot dosubot Bot added the change:backend Requires changing the backend label Jun 9, 2026
@rebenitez1802

Copy link
Copy Markdown
Contributor

From my agent reviewer:

Overall: solid fix, security model intact. One real bug to fix before merge.

🔴 High — Rename silently dropped
In _build_update_payload, the dataset-only branch returns {datasource_id, datasource_type} and returns before the name branch — so dataset_id + chart_name (no config) on the persist path loses the rename. The preview path does apply it → preview/persist divergence. Test test_dataset_and_name_update codifies the bug. Fix: add slice_name to the payload when chart_name is set; update the test.

🟡 Medium — Dataset-only rebind skips validation
_validate_update_against_dataset runs only when config is present, so a rebind keeps old params (old dataset’s columns/metrics) pointed at the new dataset — chart can break silently with no warning.

🟢 Low — datasource_type hardcoded to "table"
Fine (all MCP datasets are SqlaTable), but add a comment noting non-table datasources aren’t supported.

🟢 Low — DRY
effective_dataset_id block duplicated in _build_update_payload and _build_preview_form_data; extract a helper.

🟢 Low — Test gaps
No test for dataset_id+chart_name persist asserting rename; no negative test for rebinding to an inaccessible dataset (the security-relevant case).

@bito-code-review bito-code-review Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review Agent Run #77fe3f

Actionable Suggestions - 2
  • superset/mcp_service/chart/tool/update_chart.py - 2
Review Details
  • Files reviewed - 3 · Commit Range: 5f83dec..5f83dec
    • superset/mcp_service/chart/schemas.py
    • superset/mcp_service/chart/tool/update_chart.py
    • tests/unit_tests/mcp_service/chart/tool/test_update_chart.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

Comment on lines +139 to +144
# Dataset-only update: rebind chart to a different dataset without changing viz
if request.dataset_id is not None:
return {
"datasource_id": request.dataset_id,
"datasource_type": "table",
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing validation for dataset-only rebind

When dataset_id is provided alone (without config), the code skips validation entirely. The new dataset might not exist, might be inaccessible, or might have incompatible schema. Add a validation call using _validate_update_against_dataset with the new dataset_id before returning the payload at line 140.

Code suggestion
Check the AI-generated fix before applying
 --- superset/mcp_service/chart/tool/update_chart.py (lines 139-144) ---
 139:     # Dataset-only update: rebind chart to a different dataset without changing viz
 140:     if request.dataset_id is not None:
 141: +        # Validate the new dataset exists and is accessible before rebinding
 142: +        validation_error = _validate_update_against_dataset(
 143: +            None,  # parsed_config
 144: +            {},    # form_data (empty for dataset-only update)
 145: +            chart,
 146: +            dataset_id=request.dataset_id,
 147: +        )
 148: +        if validation_error is not None:
 149: +            return validation_error
 150:         return {
 151:             "datasource_id": request.dataset_id,
 152:             "datasource_type": "table",
 153:         }

Code Review Run #77fe3f


Should Bito avoid suggestions like this for future reviews? (Manage Rules)

  • Yes, avoid them

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in faef804. Added validation for the dataset-only rebind path in both persist and preview flows. The target dataset is now checked for existence via _validate_update_against_dataset before writing or caching. The compile check is intentionally skipped (run_compile_check=False) since there is no new chart config to execute — the existing chart form_data is preserved as-is, only the datasource binding changes.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation correctly addresses the suggestion by adding the required validation call using _validate_update_against_dataset before the dataset rebind occurs. By passing None for the config and an empty dictionary for the form data, the code ensures that the dataset existence and accessibility are verified without triggering unnecessary compilation checks, which is appropriate for a dataset-only update.

superset/mcp_service/chart/tool/update_chart.py

if request.dataset_id is not None:
        # Validate the new dataset exists and is accessible before rebinding
        validation_error = _validate_update_against_dataset(
            None,  # parsed_config
            {},    # form_data (empty for dataset-only update)
            chart,
            dataset_id=request.dataset_id,
        )
        if validation_error is not None:
            return validation_error

Comment on lines 505 to 514
if parsed_config is not None:
with event_logger.log_context(action="mcp.update_chart.validation"):
validation_error = _validate_update_against_dataset(
parsed_config, preview_or_error, chart
parsed_config,
preview_or_error,
chart,
dataset_id=request.dataset_id,
)
if validation_error is not None:
return validation_error

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing validation in preview path

The preview path (generate_preview=True) has the same validation gap as the persist path. When only dataset_id is provided, validation is skipped. Add a check: if request.dataset_id is not None and parsed_config is None: validation_error = _validate_update_against_dataset(None, {}, chart, dataset_id=request.dataset_id)

Code suggestion
Check the AI-generated fix before applying
 --- superset/mcp_service/chart/tool/update_chart.py (lines 505-514) ---
 505:             # Validate before caching the form_data — same rationale as above.
 506: +            # Also validate dataset-only updates to catch invalid/inaccessible datasets
 507: +            if parsed_config is not None or request.dataset_id is not None:
 508:                 if parsed_config is not None:
 509:                     with event_logger.log_context(action="mcp.update_chart.validation"):
 510:                         validation_error = _validate_update_against_dataset(
 511:                             parsed_config,
 512:                             preview_or_error,
 513:                             chart,
 514:                             dataset_id=request.dataset_id,
 515:                         )
 516: +                else:
 517: +                    # Dataset-only update: validate the new dataset exists
 518: +                    with event_logger.log_context(action="mcp.update_chart.validation"):
 519: +                        validation_error = _validate_update_against_dataset(
 520: +                            None,
 521:                             preview_or_error,
 522:                             chart,
 523:                             dataset_id=request.dataset_id,
 524:                         )

Code Review Run #77fe3f


Should Bito avoid suggestions like this for future reviews? (Manage Rules)

  • Yes, avoid them

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in faef804. Same dataset existence validation is now applied in the preview path when dataset_id is provided without config. The compile check is skipped (run_compile_check=False) for the same reason — no chart config to execute against the new dataset.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The suggestion to add validation in the preview path for dataset-only updates is appropriate. It ensures that dataset existence and accessibility are verified even when no chart configuration is provided, maintaining consistency with the persist path.

superset/mcp_service/chart/tool/update_chart.py

507: +            if parsed_config is not None or request.dataset_id is not None:
 508:                 if parsed_config is not None:
 509:                     with event_logger.log_context(action="mcp.update_chart.validation"):
 510:                         validation_error = _validate_update_against_dataset(
 511:                             parsed_config,
 512:                             preview_or_error,
 513:                             chart,
 514:                             dataset_id=request.dataset_id,
 515:                         )
 516: +                else:
 517: +                    # Dataset-only update: validate the new dataset exists
 518: +                    with event_logger.log_context(action="mcp.update_chart.validation"):
 519: +                        validation_error = _validate_update_against_dataset(
 520: +                            None,
 521:                             preview_or_error,
 522:                             chart,
 523:                             dataset_id=request.dataset_id,
 524:                         )

When dataset_id is provided without config, the target dataset is now
checked for existence before persisting or caching. The compile check
is intentionally skipped (run_compile_check=False) since there is no
new chart config to execute against the new dataset.

Also fixes the DatasetNotAccessible error message to report the
correct dataset ID when rebinding to a new dataset.
Comment on lines +139 to +144
# Dataset-only update: rebind chart to a different dataset without changing viz
if request.dataset_id is not None:
return {
"datasource_id": request.dataset_id,
"datasource_type": "table",
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: In the non-config path, when both dataset_id and chart_name are provided, the function returns early with only datasource fields and silently drops the rename. This creates inconsistent behavior (preview can show the renamed title while save-without-preview won't persist it). Include slice_name in the dataset-only payload when chart_name is provided instead of discarding it. [incomplete implementation]

Severity Level: Major ⚠️
- ⚠️ Save path silently ignores requested chart title rename.
- ⚠️ Preview and saved chart titles become inconsistent for users.
- ⚠️ Dataset rebind plus rename workflow behaves unpredictably.
Steps of Reproduction ✅
1. Call the MCP `update_chart` tool in
`superset/mcp_service/chart/tool/update_chart.py:339-400` with `generate_preview=True`,
omitting `config` so `parsed_config` is `None`, but providing both `dataset_id` (to
rebind) and `chart_name` (to rename), matching the documented dataset+name update use case
in the PR description.

2. In this preview path, `_build_preview_form_data` at
`superset/mcp_service/chart/tool/update_chart.py:152-200` loads `existing_form_data`,
computes `effective_dataset_id`, and because `parsed_config is None` but
`request.dataset_id` is not `None`, it skips `_missing_config_or_name_error` at lines
187-188, copies `existing_form_data` into `merged`, then sets `merged["slice_name"] =
request.chart_name` at lines 191-192 and `merged["datasource"] =
f"{effective_dataset_id}__table"` at lines 196-198, so the preview URL reflects both the
new dataset and the new title.

3. Next, call `update_chart` again with the same `identifier`, `dataset_id`, and
`chart_name` but with `generate_preview=False` (the "persist immediately" mode described
in the docstring around lines 374-381 of `update_chart.py`), which routes execution into
the non-preview save path at lines 72-119.

4. In this save path, `_build_update_payload` at
`superset/mcp_service/chart/tool/update_chart.py:98-149` sees `parsed_config is None` and
skips the config branch, then hits the "Dataset-only update" block at lines 139-144: `if
request.dataset_id is not None: return {"datasource_id": request.dataset_id,
"datasource_type": "table"}`. This early return ignores `request.chart_name`, so
`UpdateChartCommand` (invoked at lines 112-114) updates only
`datasource_id`/`datasource_type` and leaves `slice_name` unchanged, causing the rename to
be silently dropped even though the earlier preview showed the new title.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** superset/mcp_service/chart/tool/update_chart.py
**Line:** 139:144
**Comment:**
	*Incomplete Implementation: In the non-config path, when both `dataset_id` and `chart_name` are provided, the function returns early with only datasource fields and silently drops the rename. This creates inconsistent behavior (preview can show the renamed title while save-without-preview won't persist it). Include `slice_name` in the dataset-only payload when `chart_name` is provided instead of discarding it.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

Comment on lines +221 to +222
if dataset_id is not None:
dataset = DatasetDAO.find_by_id(dataset_id)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: The new dataset lookup only checks existence and skips dataset-level authorization, so a user can pass a dataset_id they cannot access and still run validation against it. This can leak whether the dataset exists and potentially schema-derived validation details before the write path permission checks run. Add an explicit access check (same pattern used in other chart MCP tools) immediately after resolving dataset_id and treat unauthorized datasets as inaccessible. [security]

Severity Level: Critical 🚨
- ❌ update_chart tool can probe unauthorized dataset existence.
- ⚠️ Validation errors may leak schema or compile details.
- ⚠️ Live compile runs against datasets before access checks.
Steps of Reproduction ✅
1. Invoke the MCP `update_chart` tool defined in
`superset/mcp_service/chart/tool/update_chart.py:339-400` with a request containing an
accessible `identifier` (existing chart) plus a `dataset_id` pointing to a different
dataset the caller cannot access, and a valid `config` payload so `parsed_config` is
non-null.

2. Inside `update_chart`, the chart is loaded and the existing datasource is checked via
`check_chart_data_access(chart)` at
`superset/mcp_service/chart/tool/update_chart.py:39-60`, which validates only the chart's
current dataset, not the requested `dataset_id`, so the request continues even if the
target dataset is unauthorized.

3. In the non-preview path (`generate_preview=False`), `update_chart` builds
`payload_or_error` and then calls `_validate_update_against_dataset(parsed_config,
new_form_data, chart, dataset_id=request.dataset_id)` at
`superset/mcp_service/chart/tool/update_chart.py:87-94`, passing the untrusted
`dataset_id` straight into validation.

4. `_validate_update_against_dataset` in
`superset/mcp_service/chart/tool/update_chart.py:203-249` executes `dataset =
DatasetDAO.find_by_id(dataset_id)` at lines 221-222 with no permission check, then calls
`validate_and_compile(parsed_config, form_data, dataset,
run_compile_check=run_compile_check)` at lines 247-249 and returns either success or
detailed validation errors in a `GenerateChartResponse`. Because the write path's access
control (in `superset/commands/chart/update.py:144-151` using `get_datasource_by_id` and
`security_manager.raise_for_access`) runs later, callers can probe which `dataset_id`
values exist and observe schema-derived validation behavior on datasets they are not
authorized to access.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** superset/mcp_service/chart/tool/update_chart.py
**Line:** 221:222
**Comment:**
	*Security: The new dataset lookup only checks existence and skips dataset-level authorization, so a user can pass a `dataset_id` they cannot access and still run validation against it. This can leak whether the dataset exists and potentially schema-derived validation details before the write path permission checks run. Add an explicit access check (same pattern used in other chart MCP tools) immediately after resolving `dataset_id` and treat unauthorized datasets as inaccessible.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

@netlify

netlify Bot commented Jun 10, 2026

Copy link
Copy Markdown

Deploy Preview for superset-docs-preview ready!

Name Link
🔨 Latest commit faef804
🔍 Latest deploy log https://app.netlify.com/projects/superset-docs-preview/deploys/6a298dfb4f3ef800087f98f3
😎 Deploy Preview https://deploy-preview-40853--superset-docs-preview.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.
🤖 Make changes Run an agent on this branch

To edit notification comments on pull requests, go to your Netlify project configuration.

@bito-code-review

bito-code-review Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Code Review Agent Run #b54a9d

Actionable Suggestions - 0
Additional Suggestions - 2
  • superset/mcp_service/chart/tool/update_chart.py - 2
    • Missing success-path unit test · Line 496-509
      The new dataset-only rebind code path (lines 496-509) lacks a success-path test. Per BITO.md rule [11730], comprehensive unit tests should cover both success and error scenarios. Add a test that sends a valid `dataset_id` without config and verifies the chart is persisted successfully.
    • Missing success-path unit test · Line 534-546
      The new dataset-only rebind code path (lines 534-546) lacks a success-path test. Per BITO.md rule [11730], comprehensive unit tests should cover both success and error scenarios. Add a test that sends a valid `dataset_id` without config and verifies the preview URL is returned.
Filtered by Review Rules

Bito filtered these suggestions based on rules created automatically for your feedback. Manage rules.

  • tests/unit_tests/mcp_service/chart/tool/test_update_chart.py - 2
Review Details
  • Files reviewed - 2 · Commit Range: 5f83dec..faef804
    • superset/mcp_service/chart/tool/update_chart.py
    • tests/unit_tests/mcp_service/chart/tool/test_update_chart.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

change:backend Requires changing the backend size/L

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants