feat(api): extending uuid field in dataset, chart, dasboard APIs #29573

dankor · 2024-07-12T10:03:20Z

SUMMARY

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

Extend API to handle uuid

TESTING INSTRUCTIONS

GET http://localhost:8088/api/v1/dataset/?q={"columns": ["uuid"]}
GET http://localhost:8088/api/v1/chart/?q={"columns": ["uuid"]}
GET http://localhost:8088/api/v1/dashboard/?q={"columns": ["uuid"]}
Plus, PUT and POST.

ADDITIONAL INFORMATION

Has associated issue:
Required feature flags:
Changes UI
Includes DB Migration (follow approval process in SIP-59)
- Migration is atomic, supports rollback & is backwards-compatible
- Confirm DB migration upgrade and downgrade tested
- Runtime estimates and downtime expectations provided
Introduces new feature or API
Removes existing feature or API

github-actions

Congrats on making your first PR and thank you for contributing to Superset! 🎉 ❤️

We hope to see you in our Slack community too! Not signed up? Use our Slack App to self-register.

codecov · 2024-07-12T16:09:08Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 64.99%. Comparing base (76d897e) to head (9c71763).
Report is 500 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master   #29573      +/-   ##
==========================================
+ Coverage   60.48%   64.99%   +4.50%     
==========================================
  Files        1931      526    -1405     
  Lines       76236    37928   -38308     
  Branches     8568        0    -8568     
==========================================
- Hits        46114    24651   -21463     
+ Misses      28017    13277   -14740     
+ Partials     2105        0    -2105

Flag	Coverage Δ
hive	`49.12% <100.00%> (-0.05%)`	⬇️
javascript	`?`
presto	`53.68% <100.00%> (-0.13%)`	⬇️
python	`64.99% <100.00%> (+1.50%)`	⬆️
unit	`59.90% <100.00%> (+2.28%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

michael-s-molina · 2024-07-12T17:23:29Z

superset/datasets/api.py

@@ -123,6 +125,36 @@ class DatasetRestApi(BaseSupersetModelRestApi):
        "schema",
        "sql",
        "table_name",
+        "always_filter_main_dttm",


@dpgaspar won't the payload critically increase with this change?

We have datasets with thousands of metrics and columns.

If it's only blocker, I don't mind to revert that. Perhaps it makes more sense to introduce a parallel API (v2) and catch up the front-end later. Please let me know, no rush here.

Could potentially generate huge payloads like @michael-s-molina is saying

betodealmeida · 2024-07-12T17:23:52Z

Hi, @dankor! Welcome to the Superset GitHub!

Can you add some context on why these fields are needed? Thanks!

dankor · 2024-07-13T15:06:28Z

Hi, @dankor! Welcome to the Superset GitHub!

Can you add some context on why these fields are needed? Thanks!

Hi @betodealmeida! Sure! I'm working on establishing a robust and granular deployment process for Superset. The core issue here is similar to why we expose these UUIDs in the import/export API: we need persistent identifiers across multiple Superset instances.

You might suggest extending the import/export API instead, but we’re encountering several challenges:

No support for tags, roles, etc.
Deprecated resources aren’t cleaned up.
Archives are much harder to debug compared to classic CRUD operations.

What I’m really aiming for is a declarative way to manage Superset resources (let’s call it a "report-as-code" approach). Imagine having a development environment where everyone can experiment freely. Once a report is ready, it can be committed to git, reviewed, and then deployed to higher environments.

Ideally, I envision mature version of this, but supporting all types of resources. You can check out a proof-of-concept in this video, though no code is available.

My plan is to implement this feature incrementally, starting with minor changes and aligning as closely as possible with upstream.

Hope this makes sense! Let me know if you're open to these kinds of contributions and what concerns the community might have.

dankor · 2024-07-29T07:37:39Z

Hey all! I'm just wondering if there are any upcoming plans or next steps here. Could you please let me what to expect?

rusackas · 2024-07-29T17:21:42Z

It looks like CI is stuck because of some tests failing (which I'm re-running), and pre-commit checks (i.e. linting) so you will need to run the Pre-commit hooks

Superset uses Git pre-commit hooks courtesy of pre-commit. To install run the following:

pip3 install -r requirements/integration.txt
pre-commit install
A series of checks will now run when you make a git commit.

You can then run pre-commit manually:

pre-commit run --all-files

michael-s-molina

list_columns is used by the frontend to load the datasets/charts/dashboards pages. Projecting all column/metric information will significantly impact performance for environments that have thousands of columns/metrics. If you want to query for that information you can request them using the q parameter.

dankor · 2024-07-29T19:10:14Z

Hi @rusackas ! Thanks! I fixed the linting.

@michael-s-molina Fair point. Actually, I have added this fields in order to q them:

{
  "columns": [
    "always_filter_main_dttm",
    "cache_timeout",
    "catalog",
    "columns.advanced_data_type",
    "columns.column_name",
    "columns.description",
    "columns.expression",
    "columns.extra",
    "columns.filterable",
    "columns.groupby",
    "columns.is_active",
    "columns.is_dttm",
    "columns.python_date_format",
    "columns.type",
    "columns.verbose_name",
    "default_endpoint",
    "description",
    "extra",
    "fetch_values_predicate",
    "filter_select_enabled",
    "is_managed_externally",
    "is_sqllab_view",
    "main_dttm_col",
    "metrics.d3format",
    "metrics.description",
    "metrics.expression",
    "metrics.extra",
    "metrics.metric_name",
    "metrics.metric_type",
    "metrics.verbose_name",
    "metrics.warning_text",
    "normalize_columns",
    "offset",
    "schema",
    "sql",
    "table_name",
    "template_params",
    "uuid",
    "database.uuid"
  ],
  "filters": [
    {
      "col": "uuid",
      "opr": "eq",
      "value": "7d2a123a-1952-46e1-b9d0-4d41e1e042c1"
    }
  ]
}

Otherwise these fields won't be listed with q.

Do you suppose to keep list_columns unchanged and perform two requests instead?

Run list_columns with uuid filter and get id.
Run show_select_columns against obtained id with all required fields.

I'd tend to have one request if possible, but I assume the front-end changes may be too big, right?

michael-s-molina · 2024-07-29T19:18:30Z

I'd tend to have one request if possible, but I assume the front-end changes may be too big, right?

You would need to change the requests for all pages but I'm also worried that the default implementation of the endpoint will perform these joins (columns and metrics belong to separate tables).

@dpgaspar Is there a way we can achieve this without multiple requests?

dankor · 2024-08-08T08:51:10Z

I'm wondering if we can come up with some action plan here. It currently seems quite frontend-oriented. I'd like to find a common ground in order to both fulfill my use-case and contribute it upstream, so community can benefit too. What's your vision here?

rusackas · 2024-08-15T16:12:03Z

I think the main thing is that we can't impact performance. @dpgaspar is currently out of office, but may be able to provide suggestions upon his return of how to handle this at the API layer.

I'm also wondering if this might be a precursor to a (fairly common) request to change URLs from being incremental integer IDs to UUIDs (e.g. here)

It might also tie into this issue.

Maybe we can just solve the performance concern and let this through, but at some point we need to have a SIP proposed about how/where UUIDs should be used more systemically by Superset. I think that's the real path forward - broadening the discussion to formulate a plan to use UUIDs everywhere relevant.

dpgaspar · 2024-08-19T15:01:50Z

I'd tend to have one request if possible, but I assume the front-end changes may be too big, right?

You would need to change the requests for all pages but I'm also worried that the default implementation of the endpoint will perform these joins (columns and metrics belong to separate tables).

@dpgaspar Is there a way we can achieve this without multiple requests?

If we include columns and metrics on a dataset payload, these results won't be paginated. So I advise making multiple requests

dankor · 2024-08-26T17:28:03Z

Hey @betodealmeida @rusackas @michael-s-molina @dpgaspar !

Could you please take another look? I've reworked this PR to eliminate the need for listing calls when looking up a single item by UUID. I was inspired by the dashboard GET approach, where the dashboard API supports three different PKs: ID, slug, and UUID. I've only implemented this for the chart so far, to get your feedback on the new approach. I hope it makes more sense now. Looking forward to hearing from you.

Added uuid field to dataset, chart, dashboard API

7d2a9b5

pull-request-size bot added the size/M label Jul 12, 2024

github-actions bot added the api Related to the REST API label Jul 12, 2024

dosubot bot added api:charts Related to the REST endpoints of charts api:dashboard Related to the REST endpoints of the Dashboard labels Jul 12, 2024

dankor mentioned this pull request Jul 12, 2024

Extend API to handle uuid #29572

Open

3 tasks

github-actions bot reviewed Jul 12, 2024

View reviewed changes

michael-s-molina requested a review from dpgaspar July 12, 2024 17:20

michael-s-molina reviewed Jul 12, 2024

View reviewed changes

michael-s-molina added the review:checkpoint Last PR reviewed during the daily review standup label Jul 12, 2024

michael-s-molina removed the review:checkpoint Last PR reviewed during the daily review standup label Jul 15, 2024

michael-s-molina requested changes Jul 29, 2024

View reviewed changes

lint fix

9c71763

rusackas mentioned this pull request Aug 15, 2024

The default URL generated by Dashboard has security risks！ #29879

Open

dankor closed this Aug 25, 2024

dankor force-pushed the master branch from 9c71763 to 5bebcf6 Compare August 25, 2024 18:53

pull-request-size bot added size/XS and removed size/M labels Aug 25, 2024

Danylo Korostil added 2 commits August 25, 2024 22:06

Merge branch 'master' of github.com:dankor/superset

867afa0

support both id or uuid in slice api pk

6f7efbf

dankor reopened this Aug 26, 2024

pull-request-size bot added size/L and removed size/XS labels Aug 26, 2024

revert dataset list api columns

a5db327

dankor requested a review from michael-s-molina September 3, 2024 07:17

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(api): extending uuid field in dataset, chart, dasboard APIs #29573

feat(api): extending uuid field in dataset, chart, dasboard APIs #29573

dankor commented Jul 12, 2024 •

edited

Loading

github-actions bot left a comment

codecov bot commented Jul 12, 2024 •

edited

Loading

michael-s-molina Jul 12, 2024

michael-s-molina Jul 12, 2024

dankor Aug 12, 2024

dpgaspar Aug 19, 2024

betodealmeida commented Jul 12, 2024

dankor commented Jul 13, 2024 •

edited

Loading

dankor commented Jul 29, 2024

rusackas commented Jul 29, 2024

michael-s-molina left a comment

dankor commented Jul 29, 2024 •

edited

Loading

michael-s-molina commented Jul 29, 2024 •

edited

Loading

dankor commented Aug 8, 2024

rusackas commented Aug 15, 2024

dpgaspar commented Aug 19, 2024

dankor commented Aug 26, 2024

feat(api): extending uuid field in dataset, chart, dasboard APIs #29573

Are you sure you want to change the base?

feat(api): extending uuid field in dataset, chart, dasboard APIs #29573

Conversation

dankor commented Jul 12, 2024 • edited Loading

SUMMARY

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

ADDITIONAL INFORMATION

github-actions bot left a comment

Choose a reason for hiding this comment

codecov bot commented Jul 12, 2024 • edited Loading

Codecov Report

michael-s-molina Jul 12, 2024

Choose a reason for hiding this comment

michael-s-molina Jul 12, 2024

Choose a reason for hiding this comment

dankor Aug 12, 2024

Choose a reason for hiding this comment

dpgaspar Aug 19, 2024

Choose a reason for hiding this comment

betodealmeida commented Jul 12, 2024

dankor commented Jul 13, 2024 • edited Loading

dankor commented Jul 29, 2024

rusackas commented Jul 29, 2024

michael-s-molina left a comment

Choose a reason for hiding this comment

dankor commented Jul 29, 2024 • edited Loading

michael-s-molina commented Jul 29, 2024 • edited Loading

dankor commented Aug 8, 2024

rusackas commented Aug 15, 2024

dpgaspar commented Aug 19, 2024

dankor commented Aug 26, 2024

dankor commented Jul 12, 2024 •

edited

Loading

codecov bot commented Jul 12, 2024 •

edited

Loading

dankor commented Jul 13, 2024 •

edited

Loading

dankor commented Jul 29, 2024 •

edited

Loading

michael-s-molina commented Jul 29, 2024 •

edited

Loading