Skip to content

Conversation

wmTJc9IK0Q
Copy link
Contributor

See #770

@wmTJc9IK0Q wmTJc9IK0Q requested a review from a team as a code owner October 6, 2025 02:26
@Copilot Copilot AI review requested due to automatic review settings October 6, 2025 02:26
@CLAassistant
Copy link

CLAassistant commented Oct 6, 2025

CLA assistant check
All committers have signed the CLA.

Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements a mechanism to skip the render-config operation when the output file already exists, addressing issue #770. The change adds an early exit condition to prevent unnecessary processing when the target output file is already present.

  • Adds file existence check before processing render-config
  • Exits early with status 0 and informational message when output file exists

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Copy link
Member

@benbz benbz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this. My preference would be to delete the file if it exists and have it be re-rendered so that we know it is up-to-date (e.g. any external secrets / appservice registrations could change at any time). My understanding is that you should have permissions to do this as the permissions will come from the parent directory rather than the file itself (as this process should be running with the user id the owner). Does that sound like a way forward that would work for you?

@wmTJc9IK0Q
Copy link
Contributor Author

wmTJc9IK0Q commented Oct 6, 2025

I considered that but I think there is an edge case there that could cause a config to be applied that maybe shouldn't be:

  • helm upgrade starts, configmap and secrets are updated
  • something fails during the upgrade, before the k8s.element.io/synapse-config-hash and k8s.element.io/synapse-secret-hash labels on the statefulset are updated which would cause a rollout of a new pod.
  • if the synapse pod is restarted in this scenario, it would get an updated config which could be wrong vs. continuing to work with the previous config

I think it's slightly safer to assume immutability of the config and only have the pod change it's config with the hash labels update.

@benbz
Copy link
Member

benbz commented Oct 7, 2025

I considered that but I think there is an edge case there that could cause a config to be applied that maybe shouldn't be:

* helm upgrade starts, configmap and secrets are updated

* something fails during the upgrade, before the `k8s.element.io/synapse-config-hash` and `k8s.element.io/synapse-secret-hash` labels on the statefulset are updated which would cause a rollout of a new pod.

* if the synapse pod is restarted in this scenario, it would get an updated config which could be wrong vs. continuing to work with the previous config

I think it's slightly safer to assume immutability of the config and only have the pod change it's config with the hash labels update.

Because of the check config hook the (chart provided) Synapse config won't be updated until that job has passed and the config is known valid. I think the likelihood of the the helm upgrade being interrupted between applying the updated ConfigMaps/Secrets and applying the updated StatefulSet is too small to worry about. For other components, yes we don't have check config hooks to validate their config but an admin should notice a failing helm upgrade command.

There is another case were the Pod could be restarted and get different config from when helm install/helm upgrade was last run: external Secrets change then Pod gets evicted from a Node. In that scenario the only thing I can see us being able to do to prevent that is that the Pod is given permissions to create/update Secrets in cluster to save off the config it started with as part of the render-config init-container. I dislike that as I don't think the applications should have permissions like that in general.

Given that we're left with at least one situation where the Pods could restart with different config to what they were originally started with, I would lean towards us being consistent and saying that config is always regenerated on startup regardless of whether it is a Pod being freshly (re)created or restarting.

@wmTJc9IK0Q
Copy link
Contributor Author

@benbz I have updated the PR to delete. I am testing this on my own cluster now.

@wmTJc9IK0Q wmTJc9IK0Q changed the title Skip render-config when output exists render-config: delete and re-create config when output file already exists Oct 7, 2025
Copy link

github-actions bot commented Oct 8, 2025

dyff of changes in rendered templates of CI manifests

No changes in rendered templates

@gaelgatelement gaelgatelement merged commit c7170f7 into element-hq:main Oct 8, 2025
69 of 70 checks passed
@gaelgatelement
Copy link
Member

Thanks for the PR! This will be part of next release through https://github.com/element-hq/ess-helm/pull/782/files

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants