Worker instances keep restarting after upgrade to 4.0.0-dev #28445
Comments
To address the issue of worker instances continuously restarting after upgrading to Superset 4.0.0, start by reviewing the worker configuration and logs. If the problem persists, dig into Superset's change log for version 4.0.0 to identify any breaking changes or additional migration steps.
|
It's an issue with the liveness probe:
|
After running `celery -A superset.tasks.celery_app:app inspect ping -d celery@$HOSTNAME` manually in the worker pod, the response was an error.
|
After doing some digging, it appears to be related to a Celery version issue with Redis: #28471
|
After upgrading to Celery 5.4, Flower can see the workers listed, but the liveness ping inside the worker still fails with "no nodes replied".
|
The liveness probe still fails with 'Error: No nodes replied within time constraint' as the result of `celery -A superset.tasks.celery_app:app inspect ping -d celery@$HOSTNAME`.
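For context, the failing probe comes from the chart's worker spec. Here is a sketch of how it can be overridden in values.yaml; the field names below follow the Superset Helm chart's `supersetWorker` section as I understand it, so verify them against your chart version:

```yaml
supersetWorker:
  livenessProbe:
    exec:
      # Same ping the chart runs; fails if the worker can't reach the broker
      command:
        - sh
        - -c
        - celery -A superset.tasks.celery_app:app inspect ping -d celery@$HOSTNAME
    initialDelaySeconds: 120
    periodSeconds: 60
    # Raising the timeout gives a slow broker more headroom before the
    # kubelet restarts the pod
    timeoutSeconds: 60
    failureThreshold: 3
```
|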
Was this problem resolved? I'm also getting the same error.
|
The worker instances keep restarting after upgrading to Superset 4.0.0-dev, most likely due to the upgrade of Celery and its dependency packages mentioned in the changelog:

- [#26604](https://github.com/apache/superset/pull/26604) feat(celery): upgrade celery and its dependencies packages (@Musa10)

This upgrade may have introduced changes or incompatibilities that cause the worker instances to restart. For the specific configuration changes required for Celery 5.X, define the `CeleryConfig` class in `superset_config.py` along these lines:

```python
from celery.schedules import crontab

# REDIS_HOST, REDIS_PORT, REDIS_CELERY_DB and REDIS_RESULTS_DB are assumed
# to be defined earlier in superset_config.py (e.g. from environment variables).
class CeleryConfig:
    broker_url = f"redis://{REDIS_HOST}:{REDIS_PORT}/{REDIS_CELERY_DB}"
    imports = (
        "superset.sql_lab",
        "superset.tasks.scheduler",
        "superset.tasks.thumbnails",
        "superset.tasks.cache",
    )
    result_backend = f"redis://{REDIS_HOST}:{REDIS_PORT}/{REDIS_RESULTS_DB}"
    worker_prefetch_multiplier = 1
    task_acks_late = False
    beat_schedule = {
        "reports.scheduler": {
            "task": "reports.scheduler",
            "schedule": crontab(minute="*", hour="*"),
        },
        "reports.prune_log": {
            "task": "reports.prune_log",
            "schedule": crontab(minute=10, hour=0),
        },
    }

CELERY_CONFIG = CeleryConfig
```

This configuration includes settings for the broker URL, result backend, task imports, worker prefetch multiplier, task acknowledgments, and beat schedule [3][4]. Additionally, ensure that your command-line invocation for the worker is updated to match the new Celery 5.X requirements. For example:

```yaml
command:
  - "/bin/sh"
  - "-c"
  - ". {{ .Values.configMountPath }}/superset_bootstrap.sh; celery --app=superset.tasks.celery_app:app worker"
```

Make sure to review and adjust your configurations to align with the new Celery 5.X guidelines.
@dosu, in this case, what is your cache_config? Here's mine:
|
The default cache configuration in Superset looks like this:

```python
import os
from datetime import timedelta

# Import paths as used by Superset's own config module
from superset.key_value.types import JsonKeyValueCodec
from superset.superset_typing import CacheConfig

# Default cache for Superset objects
CACHE_CONFIG: CacheConfig = {
    "CACHE_TYPE": "RedisCache",
    "CACHE_DEFAULT_TIMEOUT": 300,
    "CACHE_KEY_PREFIX": "superset_",
    "CACHE_REDIS_HOST": os.getenv("REDIS_HOST", "redis"),
    "CACHE_REDIS_PORT": os.getenv("REDIS_PORT", "6379"),
    "CACHE_REDIS_DB": os.getenv("REDIS_RESULTS_DB", "1"),
}

# Cache for datasource metadata and query results
DATA_CACHE_CONFIG: CacheConfig = CACHE_CONFIG

# Cache for dashboard filter state. `CACHE_TYPE` defaults to `SupersetMetastoreCache`
# that stores the values in the key-value table in the Superset metastore, as it's
# required for Superset to operate correctly, but can be replaced by any
# `Flask-Caching` backend.
FILTER_STATE_CACHE_CONFIG: CacheConfig = {
    "CACHE_TYPE": "SupersetMetastoreCache",
    "CACHE_DEFAULT_TIMEOUT": int(timedelta(days=90).total_seconds()),
    # Should the timeout be reset when retrieving a cached value?
    "REFRESH_TIMEOUT_ON_RETRIEVAL": True,
    # The following parameter only applies to `MetastoreCache`:
    # How should entries be serialized/deserialized?
    "CODEC": JsonKeyValueCodec(),
}

# Cache for explore form data state. `CACHE_TYPE` defaults to `SupersetMetastoreCache`
# that stores the values in the key-value table in the Superset metastore, as it's
# required for Superset to operate correctly, but can be replaced by any
# `Flask-Caching` backend.
EXPLORE_FORM_DATA_CACHE_CONFIG: CacheConfig = {
    "CACHE_TYPE": "SupersetMetastoreCache",
    "CACHE_DEFAULT_TIMEOUT": int(timedelta(days=7).total_seconds()),
    # Should the timeout be reset when retrieving a cached value?
    "REFRESH_TIMEOUT_ON_RETRIEVAL": True,
    # The following parameter only applies to `MetastoreCache`:
    # How should entries be serialized/deserialized?
    "CODEC": JsonKeyValueCodec(),
}
```

Superset uses Flask-Caching for caching. Flask-Caching supports various backends, including Redis (recommended), Memcached, SimpleCache (in-memory), and the local filesystem; custom cache backends are also supported. For example, to configure the filter state cache using Redis:

```python
FILTER_STATE_CACHE_CONFIG = {
    'CACHE_TYPE': 'RedisCache',
    'CACHE_DEFAULT_TIMEOUT': 86400,
    'CACHE_KEY_PREFIX': 'superset_filter_cache',
    'CACHE_REDIS_URL': 'redis://localhost:6379/0'
}
```

Dedicated cache stores require their corresponding client packages (e.g., the `redis` package for Redis, `pylibmc` for Memcached). If the required caches are undefined, Superset falls back to a built-in cache that stores data in the metadata database. For example, to use the built-in cache to store chart data:

```python
DATA_CACHE_CONFIG = {
    "CACHE_TYPE": "SupersetMetastoreCache",
    "CACHE_KEY_PREFIX": "superset_results",
    "CACHE_DEFAULT_TIMEOUT": 86400,
}
```

The cache timeout for charts can be overridden by the settings for an individual chart, dataset, or database. Caching for SQL Lab query results is configured using `RESULTS_BACKEND`. For caching thumbnails, enable the feature flag and configure a cache system and Celery workers. For example, to store images on S3:

```python
from flask import Flask
from s3cache.s3cache import S3Cache

class CeleryConfig(object):
    broker_url = "redis://localhost:6379/0"
    imports = (
        "superset.sql_lab",
        "superset.tasks.thumbnails",
    )
    result_backend = "redis://localhost:6379/0"
    worker_prefetch_multiplier = 10
    task_acks_late = True

CELERY_CONFIG = CeleryConfig

def init_thumbnail_cache(app: Flask) -> S3Cache:
    return S3Cache("bucket_name", 'thumbs_cache/')

THUMBNAIL_CACHE_CONFIG = init_thumbnail_cache
THUMBNAIL_SELENIUM_USER = "Admin"
```
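One addendum: thumbnails only render when the feature flag is on. A minimal sketch based on the Superset docs (double-check the flag names for your version):

```python
# Enable thumbnail rendering; pairs with THUMBNAIL_CACHE_CONFIG above
FEATURE_FLAGS = {
    "THUMBNAILS": True,
    "THUMBNAILS_SQLA_LISTENERS": True,
}
```
|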
Nope, I have tried `celery==5.5.0b2`; it still doesn't work.
|
Bug description
After upgrading to 4.0.0, the workers keep restarting.
How to reproduce the bug
Install the Helm chart with this values.yaml
Screenshots/recordings
Logs from the worker pod
Superset version
4.0.0
Python version
3.10
Node version
16
Browser
Chrome
Additional context
No response