fix: refresh radosgw.conf mon host from UpdateConfig #772
Open
sabaini wants to merge 1 commit into
Signed-off-by: Peter Sabaini <peter.sabaini@canonical.com>
Description
radosgw.conf's "mon host" line was written once at RGW enable time by EnableRGW() and never refreshed, while ceph.conf is re-rendered every minute by the Start() goroutine via UpdateConfig(). As monitors join the cluster, radosgw.conf's mon host list goes permanently stale (e.g. only 2 of 3 monitors), so radosgw can fail to bootstrap when every listed monitor is down at startup while an unlisted one is up. (Downstream: LP snap-openstack bug 2095567.)
Keep radosgw.conf's mon host in sync from the same refresh path that updates ceph.conf: add updateRadosGWMonHost(), which reuses the existing atomic in-place line rewriter fixConfigLine (sibling to fixRadosGWRunDir) to rewrite only the "mon host =" line. This preserves the unpersisted RGW frontend port/SSL settings, which cannot be re-rendered from scratch. UpdateConfig() invokes it right after formatting the monitor addresses, so both files share one source of truth, and warns-and-continues on error so a radosgw.conf write failure cannot block the ceph.conf refresh. A missing file (RGW disabled) is a no-op; an already-current line is left untouched (idempotent).
Also switch UpdateConfig() to the injectable fetchConfigDb seam (already used by backwardCompatPubnet) so the refresh path is testable without a real database.
Fixes #766
Assisted-by: pi:z-ai/glm-5.2
How has this been tested?
Tests are included in this PR.