Right now there is 1 replica running at a time. A second can run temporarily during deployment, but we should be running a minimum of 2 at all times in case one of them crashes. Furthermore, we should be able to scale horizontally with demand.
The problem with simply enabling multiple replicas is that every background job will then also run once per replica, which is fine for some jobs but not for others:
Watcher expiration job - Runs hourly and deletes any expired watchers from the database. Does not require HA, and it's not a big deal if several of them run at once, as the job is just 1 query.
Relay renewal job - Runs daily and renews all topics. Does not require HA. This is resource intensive and only 1 should run at a time. Also, with "Renew topic subscriptions only when they need to be" (#325), the architecture will change a bit and we may want some locking ability, or another mechanism, to avoid multiple revisions renewing the same topic twice unnecessarily.
Publisher service - Runs continuously and publishes any messages that need to be published. Ideally has HA in case of a crash, but it is not critical if a notification is delayed a few minutes in rare circumstances, as notifications are often delayed anyway with large queue sizes. This service may be horizontally scaled with the number of replicas, but ideally it can be scaled independently to better control relay load and queue processing time.
Conclusion: we can enable horizontal scaling following the change to avoid multiple relay renewal jobs running at once.
This will be non-trivial and will require some type of lock. We may be able to implement a lock with Redis, but that is throw-away work once we do #325, so it may be desirable to go in the #325 direction now.
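For illustration, here is a minimal sketch of the kind of lock discussed above, so that only one replica runs the relay renewal job at a time. Everything named here is hypothetical: `InMemoryLockStore` is an in-process stand-in that imitates the semantics of Redis `SET key value NX PX <ttl>`, and `run_renewal_once`, the key name `relay-renewal-job`, and the one-hour TTL are assumptions, not part of the service. A real deployment would need a store shared between replicas, and with Redis the owner-checked release is usually done atomically (for example with a small Lua script).

```python
import time
import uuid
from typing import Callable, Dict, Tuple


class InMemoryLockStore:
    """Hypothetical stand-in for a shared store such as Redis.

    set_if_absent imitates `SET key value NX PX ttl`: it succeeds only
    when no unexpired entry exists for the key.
    """

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # key -> (owner token, expiry time according to the clock)
        self._entries: Dict[str, Tuple[str, float]] = {}

    def set_if_absent(self, key: str, owner: str, ttl_seconds: float) -> bool:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return False  # another holder still has an unexpired lock
        self._entries[key] = (owner, now + ttl_seconds)
        return True

    def delete_if_owner(self, key: str, owner: str) -> bool:
        # Release only if we still own the lock, so that a replica whose
        # lock expired cannot delete a lock taken by someone else.
        entry = self._entries.get(key)
        if entry is not None and entry[0] == owner:
            del self._entries[key]
            return True
        return False


def run_renewal_once(
    store: InMemoryLockStore,
    renew_all_topics: Callable[[], None],
    ttl_seconds: float = 3600.0,  # assumed upper bound on job duration
) -> bool:
    """Run the renewal job only if this replica acquires the lock.

    Returns True if the job ran, False if another replica held the lock.
    """
    owner = uuid.uuid4().hex  # unique token identifying this attempt
    if not store.set_if_absent("relay-renewal-job", owner, ttl_seconds):
        return False
    try:
        renew_all_topics()
    finally:
        store.delete_if_owner("relay-renewal-job", owner)
    return True
```

The TTL is what keeps a crashed replica from holding the lock forever: if the holder dies mid-job, the entry expires and the next scheduled run on any replica can acquire it.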
Conclusion: we can enable horizontal scaling following the change to avoid multiple relay renewal jobs running at once.
This assumption no longer holds: batch_subscribe is now cheap and publishes are not required. That means renew operations can potentially run in parallel, and it's not a big deal if they do.