[Feature Request]: VMs in a HA-Cluster #1534
Comments
This will also be very useful if you enable some "cluster-wide" load balancer!
I think using a `lifecycle` block that ignores changes to `node_name` could cover this; a rough sketch (assuming the `proxmox_virtual_environment_vm` resource):
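```hcl
resource "proxmox_virtual_environment_vm" "example" {
  name      = "example-vm" # placeholder
  node_name = "pve-node-1" # initial placement; placeholder node name

  # ... other VM settings ...

  lifecycle {
    # Don't treat a node change made outside Terraform
    # (e.g. an HA migration) as drift to be corrected.
    ignore_changes = [node_name]
  }
}
```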
Would this be sufficient?
Good point... @bpg, can this be helpful?
In my opinion, this would be a possible solution for existing servers. However, it would also be good to define for new servers that it does not matter on which node they are deployed. This way, it is not necessary to explicitly state which node should be used.
That should work to prevent VM movement or recreation by the provider if no VM attributes have changed. However, if there is an update to a VM, and the VM is not currently on the node where Terraform thinks it is, the update would probably fail.
I'll try to update my module and do some tests, even though I don't have a reachable cluster at the moment. I'll update this thread as soon as I can. :)
On a single node, the workaround works well, including for modifications (scaling resources up and down). As I already said, at the moment I don't have a reachable cluster to test on.
I did some simple tests with this, and it seems to work fine. I deployed a VM to one node, then moved it via the Proxmox console to another node, and then changed the amount of memory assigned to that VM, and the provider updated it just fine. So the testing wasn't exactly thorough, but it seemed to at least pass the smell test.
I totally get this and agree. But I'm not sure the provider is the right place for this functionality to live. Philosophically, Terraform and all the providers I've used seem to assume that whatever resources are specified in the TF config are always available. This feels like the right approach to me, since the expectations of what the provider should do when encountering this situation will vary wildly depending on one's particular use case. For example, what should the AWS provider do when TF tries to enforce some state on an S3 bucket, but that bucket was deleted through some other means? And if you add any additional nodes to your HA group, you'd still have to modify the list of nodes in your TF variable so they get included in rotation for future deployments.

As a suggestion, maybe you can use a combination of the data sources in the provider to dynamically create the list of nodes you want to deploy to. There's a data source for HA groups (`proxmox_virtual_environment_hagroups`) which returns the list of groups. The HA group data source (`proxmox_virtual_environment_hagroup`) can return the list of nodes in a group. And then the nodes data source (`proxmox_virtual_environment_nodes`) can return the list of nodes and their online status. Perhaps with a little TF list manipulation you can break that down into a dynamic list of deployable nodes at runtime? (See the sketch at the end of this comment.)

For me personally, I think this is really a Proxmox feature request. What I'd like is to deploy a VM to an HA group instead of a specific node, and have the node scheduling and node balancing rules live inside Proxmox itself. I think it would be really powerful if Proxmox built in some of the scheduling properties from K8s, but for VMs.

A rough sketch of that list manipulation (my assumptions: `nodes` on the HA group data source is a map keyed by node name, `names`/`online` on the nodes data source are parallel lists, and the group name is a placeholder):
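```hcl
data "proxmox_virtual_environment_hagroup" "ha" {
  group = "my-ha-group" # placeholder HA group name
}

data "proxmox_virtual_environment_nodes" "all" {}

locals {
  # Nodes currently reported online (names/online assumed to be parallel lists).
  online_nodes = [
    for i, name in data.proxmox_virtual_environment_nodes.all.names :
    name if data.proxmox_virtual_environment_nodes.all.online[i]
  ]

  # Deployable targets: members of the HA group that are currently online.
  deployable_nodes = [
    for n in keys(data.proxmox_virtual_environment_hagroup.ha.nodes) :
    n if contains(local.online_nodes, n)
  ]
}
```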
Thanks @elias314 for testing and further analysis. It actually makes sense that the update still works after the VM has been moved.
That could be one solution, but given the note above, it's probably not even necessary. I think a way out of this mess would be simply treating `node_name` as the initial placement target rather than a setting to enforce.
Hi everyone, I stumbled across this issue some time ago when it was first created. I see the point that this is working as designed, but I have to disagree. In Terraform, we are talking about a stateful state. If we have a single node where a VM can be and this node is offline, the VM cannot be moved to another host; otherwise the state would not be fulfilled. However, if we have a list of nodes where the VM can go and it is on a node in this list, the state would be fulfilled.

Your S3 analogy is not quite right here, in my opinion. In S3, if the bucket was deleted (similar to taking down a node), either another node from the list will be used (another S3 bucket) or Terraform would fail. Both of these outcomes are acceptable to me. However, I am not fine with having only one node where the VM should be deployed, finding it on another, and the state being fine with it.

Best wishes
I already do this with a randomly picked node from the `proxmox_virtual_environment_nodes` data source. Of course, if you want only certain nodes, @elias314's suggestion is more precise.
I'm talking more about the drift-detection use case, when a VM is moved after creation.
You're right, Terraform is trying to enforce a stateful state. And this is why this feels uncomfortable to me: implementing a feature like this means there are now multiple possible acceptable states, but those possible states are dictated by the provider and not explicitly codified in the TF files. While this particular feature request seems pretty straightforward, you can imagine other similar feature requests where multiple possible acceptable states may not be desirable for someone's particular use case. This seems like a slippery slope to me.
While I understand the intention to use the provider as a frontend and load balancer for the Proxmox cluster, which is essentially what this ticket suggests, this is not functionality that the provider should have. As mentioned above, Terraform is not well-suited for managing a dynamic state that changes outside of its control. My main guideline for making design decisions for the provider is to follow the Proxmox API as closely as possible. If a requested feature is not available via the API or UI, it is unlikely to be implemented. From the ticket:
> So if HA is activated for a VM and it is located on a different node than specified, it is treated as if it already exists and will not be migrated back or recreated.

This could be resolved by handling `node_name` changes as discussed above.
> If a VM does not yet exist, it is created by a random mechanism on one of the specified nodes.

This describes load-balancing functionality, but Proxmox does not allow specifying an existing HA group for VM creation, only individual nodes. So, unfortunately, the answer is no.
@bpg I completely understand your decisions. And if this feature is not added, I accept that - after all, it's not my decision.
In this concept, I would understand load balancing to mean that the provider checks which of the nodes has the lowest load and creates the VM on that node.
I absolutely agree with that. A change of node can of course be completely ignored in this case via the `ignore_changes` workaround.

@elias314, I would also like to say something about this point from my point of view.
For me, including a list of possible nodes does not represent a state with several acceptable outcomes, but only one: the only acceptable state is when the VM is running on one of the nodes entered in the HA group. Otherwise the state is not correct.
To prevent exactly this problem, the suggestion would be to set the proposed `node_names` parameter.

I am aware that something like this should rather be implemented directly in Proxmox, but it is rather unlikely that this will happen, judging by a statement from a Proxmox employee.
So I would ask you, @bpg, to think about this again, as I believe this would be an exclusion criterion for many people who want to use HA properly.
Again, I too wish this feature existed -- in Proxmox. It's too bad that they don't seem interested, as it seems obvious to me that the next step in larger infrastructures is deploying to an HA group instead of a specific node. As a workaround, give this a try (a sketch under some assumptions: `names` and `online` on the nodes data source are parallel lists, and `random_shuffle` comes from the hashicorp/random provider):
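```hcl
data "proxmox_virtual_environment_nodes" "all" {}

locals {
  # Dynamic list of nodes reported online at plan time
  # (names/online assumed to be parallel lists).
  online_nodes = [
    for i, name in data.proxmox_virtual_environment_nodes.all.names :
    name if data.proxmox_virtual_environment_nodes.all.online[i]
  ]
}

# Pick one online node at random for the initial placement.
resource "random_shuffle" "target_node" {
  input        = local.online_nodes
  result_count = 1
}

resource "proxmox_virtual_environment_vm" "example" {
  name      = "example-vm" # placeholder
  node_name = random_shuffle.target_node.result[0]
  # ... other VM settings ...
}
```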
I've only done some limited testing, but this should give you a dynamic list of online nodes at runtime that you can deploy to. Couple this with the `ignore_changes` workaround discussed above, and HA migrations shouldn't cause the provider to move or rebuild your VMs on later applies.
Is your feature request related to a problem? Please describe.
Currently, a fixed node must be selected for each resource. In an HA cluster with load balancing, a VM can be moved back and forth between all nodes. If you execute "terraform apply" and one or more VMs were automatically moved to different nodes, the provider will move the VMs back to their defined node or will recreate them on the defined node.
Describe the solution you'd like
It would be a good idea to implement a `node_names` attribute in addition to the `node_name` attribute, with the two being mutually exclusive. Any node on which the VM may be located can be listed in this attribute. So if HA is activated for a VM and it is located on a different node than specified, it is treated as if it already exists and will not be migrated back or recreated. If a VM does not yet exist, it is created by a random mechanism on one of the specified nodes.
Additional context
An example configuration could look like this (a sketch of the proposed `node_names` attribute; node names are placeholders):
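```hcl
resource "proxmox_virtual_environment_vm" "example" {
  name = "example-vm"

  # Proposed attribute (does not exist yet): any node in this list
  # is an acceptable placement. Mutually exclusive with node_name.
  node_names = ["pve-node-1", "pve-node-2", "pve-node-3"]

  # ... other VM settings ...
}
```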