Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Chart cache-warmup task fails on Superset 4.0 #28706

Merged
merged 5 commits into from
Sep 7, 2024

Conversation

rmasters
Copy link
Contributor

@rmasters rmasters commented May 25, 2024

SUMMARY

Fixes #28705

We found our chart cache warm-up task (as in the example in the Kubernetes docs) was failing with this error:

[2024-05-24 22:30:00,045: INFO/ForkPoolWorker-1] cache-warmup[ae85f28c-acfa-4bb1-a885-e7f0121610b9]: Loading strategy
[2024-05-24 22:30:00,045: INFO/ForkPoolWorker-1] cache-warmup[ae85f28c-acfa-4bb1-a885-e7f0121610b9]: Loading TopNDashboardsStrategy
[2024-05-24 22:30:00,046: INFO/ForkPoolWorker-1] cache-warmup[ae85f28c-acfa-4bb1-a885-e7f0121610b9]: Success!
HTTPException
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 1823, in full_dispatch_request
    rv = self.dispatch_request()
File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 1788, in dispatch_request
    self.raise_routing_exception(req)
File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 1770, in raise_routing_exception
    raise request.routing_exception # type: ignore
File "/usr/local/lib/python3.10/site-packages/flask/ctx.py", line 351, in match_request
    result = self.url_adapter.match(return_rule=True) # type: ignore
File "/usr/local/lib/python3.10/site-packages/werkzeug/routing/map.py", line 619, in match
    raise MethodNotAllowed(valid_methods=list(e.have_match_for)) from None
werkzeug.exceptions.MethodNotAllowed: 405 Method Not Allowed: The method is not allowed for the requested URL.
[repeated]

On investigation I found that:

  • The cache-warmup task calls PUT /superset/warm_up_caches/ (via the fetch_url task)
  • /superset/warm_up_caches/ (Superset.warm_up_caches) exists but is a GET request
  • That endpoint is deprecated by /api/v1/chart/warm_up_caches (ChartRestApi.warm_up_cache) (scheduled for deletion in 4.0 but I couldn't find discussions around that)

This PR changes the task to call the newer endpoint (accidentally reverted in 56069b0).

I also ran into CSRF issues (similar to discussion in #27160), so added a method to fetch this (thanks to Will Gan on slack for the hint). Perhaps this can be placed more centrally? This appears to work on installations with CSRF disabled (I've tested with WTF_CSRF_ENABLED = False).

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

  1. Ensure you have a scheduled job to warmup caches, e.g. by adding to CeleryConfig:
     class CeleryConfig:
         broker_url = f"redis://{REDIS_HOST}:{REDIS_PORT}/{REDIS_CELERY_DB}"
    -    imports = ("superset.sql_lab",)
    +    imports = (
    +        "superset.sql_lab",
    +        "superset.tasks.cache",
    +    )
         result_backend = f"redis://{REDIS_HOST}:{REDIS_PORT}/{REDIS_RESULTS_DB}"
         worker_prefetch_multiplier = 1
         task_acks_late = False
    @@ -87,6 +90,13 @@ class CeleryConfig:
                 "task": "reports.prune_log",
                 "schedule": crontab(minute=10, hour=0),
             },
    +        "cache-warmup-dummy": {
    +            "task": "cache-warmup",
    +            "schedule": crontab(minute="*/1", hour="*"),
    +            "kwargs": {
    +                "strategy_name": "dummy",
    +            },
    +        },
         }
  2. If starting from an empty installation, create a dashboard, with at least one data-backed chart, and publish it
  3. Each minute you should see log lines like:
    superset_worker        | [2024-05-25 00:18:03,203: INFO/ForkPoolWorker-1] fetch_url[b9a9c5c6-41e0-491f-a8cb-d49c0b9ab396]: Fetching http://superset_app:8088/api/v1/security/csrf_token/
    superset_app           | 2024-05-25 00:18:03,198:DEBUG:superset.stats_logger:[stats_logger] (timing) ChartRestApi.warm_up_cache.time | 176.14193800545763
    superset_app           | 2024-05-25 00:18:03,200:INFO:werkzeug:192.168.165.7 - - [25/May/2024 00:18:03] "PUT /api/v1/chart/warm_up_cache HTTP/1.1" 200 -
    
    as compared to this on master:
    superset_worker        | [2024-05-25 00:26:00,162: INFO/MainProcess] Task fetch_url[a488cfb6-1d63-42bb-be00-a3e6af3e2c4a] received
    superset_worker        | [2024-05-25 00:26:00,165: INFO/ForkPoolWorker-1] fetch_url[9a054a67-acf0-4fd8-b0dd-42ab28da886b]: Fetching http://superset_app:8088/superset/warm_up_cache/ with payload {"chart_id": 156}
    superset_app           | 2024-05-25 00:26:00,168:WARNING:superset.views.base:HTTPException
    superset_app           | Traceback (most recent call last):
    superset_app           |   File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 1484, in full_dispatch_request
    superset_app           |     rv = self.dispatch_request()
    superset_app           |   File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 1458, in dispatch_request
    superset_app           |     self.raise_routing_exception(req)
    superset_app           |   File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 1440, in raise_routing_exception
    superset_worker        | [2024-05-25 00:26:00,167: INFO/ForkPoolWorker-7] cache-warmup[57b6894a-e9a2-4fba-bf8e-2e7d74826f61]: Scheduling {"chart_id": 156}
    superset_app           |     raise request.routing_exception  # type: ignore
    superset_app           |   File "/usr/local/lib/python3.10/site-packages/flask/ctx.py", line 353, in match_request
    superset_app           |     result = self.url_adapter.match(return_rule=True)  # type: ignore
    superset_app           |   File "/usr/local/lib/python3.10/site-packages/werkzeug/routing/map.py", line 624, in match
    superset_app           |     raise MethodNotAllowed(valid_methods=list(e.have_match_for)) from None
    superset_app           | werkzeug.exceptions.MethodNotAllowed: 405 Method Not Allowed: The method is not allowed for the requested URL.
    superset_app           | 2024-05-25 00:26:00,170:INFO:werkzeug:192.168.165.8 - - [25/May/2024 00:26:00] "PUT /superset/warm_up_cache/ HTTP/1.1" 405 -
    

A clearer example on my clean install (based on 4.0.0):

superset_worker_beat  | [2024-05-24 23:38:00,065: INFO/MainProcess] Scheduler: Sending due task cache-warmup-hourly (cache-warmup)
superset_worker       | [2024-05-24 23:38:00,084: INFO/MainProcess] Task cache-warmup[c8b9a20b-3fea-45e5-a9fc-9a2fdcfa7b40] received
superset_worker       | [2024-05-24 23:38:00,093: INFO/ForkPoolWorker-7] cache-warmup[c8b9a20b-3fea-45e5-a9fc-9a2fdcfa7b40]: Loading strategy
superset_worker       | [2024-05-24 23:38:00,094: INFO/ForkPoolWorker-7] cache-warmup[c8b9a20b-3fea-45e5-a9fc-9a2fdcfa7b40]: Loading DummyStrategy
superset_worker       | [2024-05-24 23:38:00,095: INFO/ForkPoolWorker-7] cache-warmup[c8b9a20b-3fea-45e5-a9fc-9a2fdcfa7b40]: Success!
superset_worker       | [2024-05-24 23:38:00,161: INFO/ForkPoolWorker-7] cache-warmup[c8b9a20b-3fea-45e5-a9fc-9a2fdcfa7b40]: Scheduling {"chart_id": 1}
superset_worker       | [2024-05-24 23:38:00,181: INFO/MainProcess] Task fetch_url[efd6a561-9c01-455c-aaae-bf8d4e6fb88e] received
superset_worker       | [2024-05-24 23:38:00,185: INFO/ForkPoolWorker-7] Task cache-warmup[c8b9a20b-3fea-45e5-a9fc-9a2fdcfa7b40] succeeded in 0.09269619999395218s: {'scheduled': ['{"chart_id": 1}'], 'errors': []}
superset_worker       | [2024-05-24 23:38:00,202: INFO/ForkPoolWorker-8] fetch_url[efd6a561-9c01-455c-aaae-bf8d4e6fb88e]: Fetching http://webserver.superset.local:8088/api/v1/security/csrf_token/
superset_app          | 2024-05-24 23:38:00,259:INFO:werkzeug:192.168.214.6 - - [24/May/2024 23:38:00] "GET /api/v1/security/csrf_token/ HTTP/1.1" 200 -
superset_worker       | [2024-05-24 23:38:00,261: INFO/ForkPoolWorker-8] fetch_url[efd6a561-9c01-455c-aaae-bf8d4e6fb88e]: Fetching http://webserver.superset.local:8088/api/v1/chart/warm_up_cache with payload {"chart_id": 1}
superset_app          | 2024-05-24 23:38:02,332:INFO:werkzeug:192.168.214.6 - - [24/May/2024 23:38:02] "PUT /api/v1/chart/warm_up_cache HTTP/1.1" 200 -
superset_worker       | [2024-05-24 23:38:02,333: INFO/ForkPoolWorker-8] fetch_url[efd6a561-9c01-455c-aaae-bf8d4e6fb88e]: Fetched http://webserver.superset.local:8088/api/v1/chart/warm_up_cache with payload {"chart_id": 1}, status code: 200
superset_worker       | [2024-05-24 23:38:02,335: INFO/ForkPoolWorker-8] Task fetch_url[efd6a561-9c01-455c-aaae-bf8d4e6fb88e] succeeded in 2.1522701479989337s: {'success': '{"chart_id": 1}', 'response': '{"result":[{"chart_id":1,"viz_error":null,"viz_status":"success"}]}

ADDITIONAL INFORMATION

  • Has associated issue: Chart cache warmup fails with 405 Method Not Allowed #28705
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@rmasters rmasters changed the title fix: cache-warmup task fails on Superset 4.0 fix: Chart cache-warmup task fails on Superset 4.0 May 25, 2024
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Congrats on making your first PR and thank you for contributing to Superset! 🎉 ❤️

We hope to see you in our Slack community too! Not signed up? Use our Slack App to self-register.

@pull-request-size pull-request-size bot added size/M and removed size/S labels May 25, 2024
superset/tasks/cache.py Outdated Show resolved Hide resolved
@rmasters rmasters marked this pull request as ready for review May 25, 2024 01:45
@dosubot dosubot bot added api Related to the REST API infra:caching Infra setup and configuration related to caching labels May 25, 2024
@admsev
Copy link

admsev commented May 25, 2024

@gooroodev please review

@gooroodev
Copy link

Summary of Changes

  1. New Functionality:

    • Added a new function fetch_csrf_token to fetch a CSRF token for API requests.
    • Integrated the fetch_csrf_token function into the fetch_url task to ensure that CSRF tokens are included in the headers for API requests.
  2. Modifications:

    • Updated the fetch_url method to use ChartRestApi.warm_up_cache instead of Superset.warm_up_cache.
    • Modified the integration test test_fetch_url to include tests for the new fetch_csrf_token function.
  3. Tests:

    • Added a new test test_fetch_csrf_token to verify the functionality of fetching CSRF tokens.
    • Updated the test_fetch_url to mock and validate the behavior of fetch_csrf_token.

Issues, Bugs, or Typos

  1. Error Handling in fetch_csrf_token:

    • The function fetch_csrf_token logs an error but does not raise an exception or handle the error case effectively. This can lead to silent failures.
  2. Variable Naming:

    • The variable session_cookie should be checked for None before being added to the headers to avoid potential issues if the cookie is not set.
  3. Test Improvements:

    • The test test_fetch_csrf_token should include a case where the response code is not 200 to ensure error handling is tested.

Proposed Code Improvements

Improvement 1: Error Handling in fetch_csrf_token

Current Code:

logger.error("Error fetching CSRF token, status code: %s", response.code)
return {}

Improved Code:

logger.error("Error fetching CSRF token, status code: %s", response.code)
raise Exception(f"Failed to fetch CSRF token, status code: {response.code}")

Improvement 2: Check for None in session_cookie

Current Code:

session_cookie = response.headers.get("Set-Cookie")
if response.code == 200:
    data = json.loads(body)
    return {"Cookie": session_cookie, "X-CSRF-Token": data["result"]}

Improved Code:

session_cookie = response.headers.get("Set-Cookie")
if response.code == 200 and session_cookie:
    data = json.loads(body)
    return {"Cookie": session_cookie, "X-CSRF-Token": data["result"]}
else:
    logger.error("Error fetching CSRF token or session cookie missing, status code: %s", response.code)
    raise Exception(f"Failed to fetch CSRF token or session cookie missing, status code: {response.code}")

Improvement 3: Test for Non-200 Response Code

Current Test:

def test_fetch_csrf_token(mock_urlopen, mock_request_cls, base_url, app_context):
    # ... existing test code ...

Improved Test:

def test_fetch_csrf_token(mock_urlopen, mock_request_cls, base_url, app_context):
    from superset.tasks.cache import fetch_csrf_token

    mock_request = mock.MagicMock()
    mock_request_cls.return_value = mock_request

    mock_response = mock.MagicMock()
    mock_urlopen.return_value.__enter__.return_value = mock_response

    # Test for successful response
    mock_response.code = 200
    mock_response.read.return_value = b'{"result": "csrf_token"}'
    mock_response.headers.get.return_value = "new_session_cookie"

    app.config["WEBDRIVER_BASEURL"] = base_url
    headers = {"Cookie": "original_session_cookie"}

    result_headers = fetch_csrf_token(headers)

    mock_request_cls.assert_called_with(
        "http://base-url/api/v1/security/csrf_token/",
        headers=headers,
        method="GET",
    )

    assert result_headers["X-CSRF-Token"] == "csrf_token"
    assert result_headers["Cookie"] == "new_session_cookie"
    mock_urlopen.assert_called_once_with(mock_request, timeout=mock.ANY)

    # Test for unsuccessful response
    mock_response.code = 500
    with pytest.raises(Exception):
        fetch_csrf_token(headers)

General Review of Code Quality and Style

  • Code Quality: The code is generally well-structured and follows good practices. The addition of the fetch_csrf_token function abstracts the CSRF token fetching logic, making the fetch_url function cleaner and more maintainable.
  • Code Style: The code follows PEP 8 guidelines, and the use of type hints improves readability and maintainability.
  • Logging: The use of logging is appropriate for debugging and tracking the flow of the program, but error handling could be more robust.
  • Testing: The tests cover the new functionality well, but additional edge cases and error conditions should be tested to ensure robustness.

Yours, Gooroo.dev. To receive reviews automatically, install Github App

Copy link

codecov bot commented May 29, 2024

Codecov Report

Attention: Patch coverage is 94.28571% with 2 lines in your changes missing coverage. Please review.

Project coverage is 83.65%. Comparing base (76d897e) to head (620502d).
Report is 714 commits behind head on master.

Files with missing lines Patch % Lines
superset/tasks/utils.py 93.75% 2 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           master   #28706       +/-   ##
===========================================
+ Coverage   60.48%   83.65%   +23.16%     
===========================================
  Files        1931      528     -1403     
  Lines       76236    38181    -38055     
  Branches     8568        0     -8568     
===========================================
- Hits        46114    31941    -14173     
+ Misses      28017     6240    -21777     
+ Partials     2105        0     -2105     
Flag Coverage Δ
hive 48.93% <31.42%> (-0.23%) ⬇️
javascript ?
mysql 76.66% <94.28%> (?)
postgres 76.75% <94.28%> (?)
presto 53.48% <31.42%> (-0.32%) ⬇️
python 83.65% <94.28%> (+20.16%) ⬆️
sqlite 76.24% <94.28%> (?)
unit 60.28% <28.57%> (+2.65%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@gempir
Copy link

gempir commented Jun 24, 2024

Is there anything I can do to help get this merged?

I'm also interested in working cache warmups

@rusackas
Copy link
Member

  • PEP

Feel free to give it a code review, and/or answer any of the open questions on the thread.

Still hoping to get an opinion from @betodealmeida and/or @dpgaspar , but I'll add @john-bodley to the mix as well.

@cmartinezdnn
Copy link

I also want to use cache-warmup in Superset 4.0.2. When is it planned to include this fix? Thank you!

@talfour
Copy link

talfour commented Jul 7, 2024

Thanks for providing solution, however this wouldn't work if someone is using superset-websockets as in the headers there are two "Set-Cookie"'s first one is an async-token and second session token, your code will get the first one which is wrong and you'll get 404 UNATHORIZED

@fernandoadam
Copy link

I also want to use cache-warmup in Superset 4.0.2. When is it planned to include this fix? Thank you!

I installed the version 4.0.2, and applied the fix suggested by @rmasters in the cache.py file.
And now it's working.

Before the fix, I was getting the 405 error.

@rusackas
Copy link
Member

Would love to close the related issue, but I'm not sure of the risks involved here (particularly with websockets added to the mix). Maybe @villebro or @nytai would have a look here and see if this seems save to merge.

@rusackas rusackas requested review from villebro and nytai July 30, 2024 22:55
Copy link
Member

@villebro villebro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really great work, sorry for the long delay in reviewing!

superset/tasks/cache.py Outdated Show resolved Hide resolved
@villebro
Copy link
Member

villebro commented Aug 2, 2024

@rmasters this needs a rebase, and I don't have permissions to push changes to your branch to do it. Can you rebase, and optionally make that utils refactor?

@jacobeatsspam
Copy link

@cmartinezdnn Have you tested that condition??

@rusackas
Copy link
Member

rusackas commented Aug 7, 2024

Looks like some CI checks aren't running for whatever reason. Might be in need of the aforementioned rebase, but I'll close and reopen this to see if that kick-starts it.

@rusackas rusackas closed this Aug 7, 2024
@rusackas rusackas reopened this Aug 7, 2024
@github-actions github-actions bot removed the api Related to the REST API label Aug 7, 2024
@shohamyamin
Copy link

You think that this could be merged soon?
What I need to change in the code to fix that issue manually?

@tamarinkeisari
Copy link

I also want to use cache-warmup in Superset 4.0.2. When is it planned to include this fix? Thank you!

I installed the version 4.0.2, and applied the fix suggested by @rmasters in the cache.py file. And now it's working.

Before the fix, I was getting the 405 error.

Hi, can you please deliberate on what did you change so it will work for you?
I'm encountering the same issue

@jacobeatsspam
Copy link

Yeah... any traction on this would be amazing

@rmasters
Copy link
Contributor Author

rmasters commented Aug 20, 2024

Apologies for the radio silence; I don't seem to be getting github notifications through to my email...

Many thanks for the reviews. I'll rebase and pop that refactor to utils in shortly.


@talfour Thanks for providing solution, however this wouldn't work if someone is using superset-websockets as in the headers there are two "Set-Cookie"'s first one is an async-token and second session token, your code will get the first one which is wrong and you'll get 404 UNATHORIZED

I'm not familiar with the websocket server, but am I right in understanding that it should not pick up the async/websocket-server cookie, or does it proxy requests at all? It doesn't appear that the FAB session cookie name can be changed from session so I can filter for that (it has been years since I used FAB though so if anyone knows different please lmk).

Edit: I've popped a change in that parses out all the set-cookie headers and picks the one named session - hopefully this does the trick

- Updates fetch_url task to use /api/v1/chart/warm_up_cache/
- Adds CSRF fetcher to use this endpoint
This could fail when other cookies are set, such as the websockets
server async-token cookie.
@pull-request-size pull-request-size bot added size/L and removed size/M labels Aug 20, 2024
@rmasters
Copy link
Contributor Author

Hey @villebro, I've rebased onto master, split out the csrf-token fetcher and made a fix towards the session cookie grabber, if you can trigger the CI for me please 🚀

@fzhan
Copy link

fzhan commented Sep 6, 2024

@rmasters thanks for the rebase. Any changes we need to add in terms of ingress annotation for the chart?

@villebro
Copy link
Member

villebro commented Sep 6, 2024

I've started CI and will try to review this tomorrow (PST)

@villebro
Copy link
Member

villebro commented Sep 6, 2024

Waiting for flaky tests to rerun, will try to merge as soon as CI passes.

@villebro
Copy link
Member

villebro commented Sep 6, 2024

One last flaky test to go.. 🤦

@villebro villebro merged commit 0744abe into apache:master Sep 7, 2024
37 checks passed
@sadpandajoe sadpandajoe added the v4.1 Label added by the release manager to track PRs to be included in the 4.1 branch label Sep 11, 2024
sadpandajoe pushed a commit that referenced this pull request Sep 11, 2024
@tamarinkeisari
Copy link

Hi in which version this PR is going to be released?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
infra:caching Infra setup and configuration related to caching size/L v4.1 Label added by the release manager to track PRs to be included in the 4.1 branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Chart cache warmup fails with 405 Method Not Allowed