VMDistributedCluster CR: orchestrate VMCluster upgrades #1556
base: master
Conversation
I initially thought the distributed CR was needed for full distributed setup management, but it looks like it only performs version upgrades. In that case, I'm just curious why we need different CRs for VM, VT and VL?
Force-pushed eaeacd4 to c02b24c
Yes, so far we're focusing on upgrades - existing CRs provide sufficient flexibility IMO - and we didn't get a request for other actions so far.
VL and VT don't have agents (yet), so their specs would be different. However, we can reuse the same approach and probably even some helper functions.
> Yes, so far we're focusing on upgrades - existing CRs provide sufficient flexibility IMO - and we didn't get a request for other actions so far.
I believe users would expect to modify the vmcluster spec values or apply extra flags to the vmclusters.
And since `vmclusterSpec.ClusterVersion` is optional, users could specify component versions inside `vmclusterSpec`, which override `vmclusterSpec.ClusterVersion`.
Currently, it seems VMDistributedCluster only covers a limited scenario where resources like vmcluster, vmuser, and vmauth are defined and configured as needed.
Could you please provide an example of how to configure them to achieve a topology similar to the one described in the victoria-metrics-distributed chart? I expect VMDistributedCluster to be supported there when released.
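For illustration only, a hypothetical manifest of the kind being asked about - every field name below is an assumption for the sketch, not the actual schema introduced by this PR:

```yaml
# Hypothetical sketch, NOT the PR's actual schema: a distributed CR
# referencing per-zone VMClusters plus the read-path VMAuth.
apiVersion: operator.victoriametrics.com/v1alpha1
kind: VMDistributedCluster
metadata:
  name: example
spec:
  clusterVersion: v1.102.0          # upgraded in orchestrated fashion
  zones:
    - vmClusterName: vmcluster-zone-a
      vmAgentName: vmagent-zone-a   # optional; queue must drain during upgrade
    - vmClusterName: vmcluster-zone-b
  readVMAuthName: vmauth-read       # read path paused while a zone upgrades
```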
```go
if vmClusterAgentPair.VMAgent != nil {
	vmAgent = &vmAgentAdapter{VMAgent: vmClusterAgentPair.VMAgent}
}
waitForVMClusterVMAgentMetrics(ctx, httpClient, vmAgent, deadline)
```
We do not need to check the vmagent persistent queue here; it should occur after updating the vmcluster and before resuming vmuser reading from the vmcluster, which ensures data integrity on the upgraded vmcluster.
Oh, so I should move it after waitForVMClusterReady, not before, right?
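The ordering being agreed on above can be sketched as a plain sequence. This is a non-authoritative sketch; apart from `waitForVMClusterReady` and `waitForVMClusterVMAgentMetrics`, which appear in the diff, the step names are assumptions:

```go
package main

import "fmt"

// upgradeSteps lists the per-zone orchestration order discussed above:
// the persistent-queue check runs after the upgraded VMCluster is ready
// and before read traffic through vmauth is resumed.
func upgradeSteps() []string {
	return []string{
		"pause read VMAuth for the zone",
		"update VMCluster spec",
		"waitForVMClusterReady",
		"waitForVMClusterVMAgentMetrics", // queue-drained check: after readiness
		"resume read VMAuth for the zone",
	}
}

func main() {
	for i, s := range upgradeSteps() {
		fmt.Printf("%d. %s\n", i+1, s)
	}
}
```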
```go
if err != nil {
	return false, err
}
return metricValue == 0, nil
```
The value of `vmagent_remotewrite_pending_data_bytes` may sometimes exceed 0 because it also includes in-memory data not yet flushed to remote storage. It's better to set a small threshold, such as 1e6; if the value is below this, the queue can be considered drained.
It's better to use the `vm_persistentqueue_bytes_pending` metric instead. It doesn't take into account the in-memory part of the queue.
Yup, setting generic
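A minimal sketch of the drained-queue check under discussion; `metricValue` would come from scraping `vm_persistentqueue_bytes_pending`, and the 1e6-byte threshold follows the suggestion above (the function and constant names are illustrative):

```go
package main

import "fmt"

// drainedThresholdBytes: a small allowance instead of a strict == 0
// check, since pending-bytes gauges rarely settle at exactly zero.
const drainedThresholdBytes = 1e6

// queueDrained reports whether the persistent queue can be considered
// drained, given the scraped vm_persistentqueue_bytes_pending value.
func queueDrained(metricValue float64) bool {
	return metricValue < drainedThresholdBytes
}

func main() {
	fmt.Println(queueDrained(0))     // true: nothing pending
	fmt.Println(queueDrained(512e3)) // true: below threshold
	fmt.Println(queueDrained(5e6))   // false: still draining
}
```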
… uses server-side apply

This prevents kubectl from creating an annotation which is too large and failing with the `The CustomResourceDefinition "vmdistributedclusters.operator.victoriametrics.com" is invalid: metadata.annotations: Too long: may not be more than 262144 bytes` error.
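The fix corresponds to using kubectl's server-side apply, which skips the `kubectl.kubernetes.io/last-applied-configuration` annotation that client-side apply writes (the file path below is illustrative):

```shell
# Client-side apply stores the whole object in an annotation, which
# exceeds the 262144-byte annotation limit for large CRDs.
# Server-side apply tracks field ownership on the API server instead
# and creates no such annotation.
kubectl apply --server-side -f config/crd/vmdistributedclusters.yaml
```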
Force-pushed 8945925 to 1441ebe
Force-pushed 280b2e6 to 04b44f9
Add a new CR - VMDistributedCluster - so that multiple VMClusters can be upgraded in an orchestrated fashion, ensuring the read VMAuth is disabled before the upgrade and that the VMAgent (if available) doesn't have pending bytes to send. See #1515 (comment) for the agreed limitations of the v1alpha1 version.
Fixes #1515
TODO:
Keeping the original commits for review, as it's useful to show how the feature was developed.
During development I realized VMAuth / VMClusters may live in different namespaces. The initial version requires all objects to be in the same namespace as the VMDistributedCluster. Do we want to keep it that way for API simplicity, or is it worth having the initial version support cross-namespace references?