Replies: 5 comments
-
@GerorgeEG, did you try mounting the new storage "by hand"? Side note: "/mnt/a288c84e-d100-334b-9bc3-0d79ffe9a610/KVMHA//hb-" does not look like a valid heartbeat file to try to create. An IP/hostname is missing.
This might indicate a network configuration mistake. Is the directory
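To make the side note concrete, here is a minimal Python sketch that checks whether a heartbeat path carries a host part after the `hb-` prefix. The function name `heartbeat_host` and the sample address `192.0.2.10` are hypothetical, and the `hb-<host>` naming is an assumption based on the path in the log, not a statement of how the agent validates it.

```python
import posixpath
from typing import Optional


def heartbeat_host(hb_path: str) -> Optional[str]:
    """Return the host part of a heartbeat file named hb-<host>.

    Returns None when the file name does not start with "hb-" or when
    nothing follows the prefix (the case seen in the log).
    """
    name = posixpath.basename(hb_path)
    if not name.startswith("hb-"):
        return None
    host = name[len("hb-"):]
    return host or None


if __name__ == "__main__":
    # Path taken from the agent log: nothing follows "hb-".
    from_log = "/mnt/a288c84e-d100-334b-9bc3-0d79ffe9a610/KVMHA//hb-"
    print(heartbeat_host(from_log))

    # Hypothetical well-formed path, using a documentation-range address.
    well_formed = "/mnt/a288c84e-d100-334b-9bc3-0d79ffe9a610/KVMHA/hb-192.0.2.10"
    print(heartbeat_host(well_formed))
```

An empty host part here lines up with the empty `-h` argument in the kvmheartbeat.sh command shown in the message log.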
-
Hi @DaanHoogland, thanks for picking this up. We found that the problem was with the NFS share, which is why mounting was stuck and the hosts got rebooted. However, we want to avoid rebooting hosts when only one of the primary storages has an issue and the others are still accessible. Is there any way to prevent the KVM hosts from rebooting? I found one setting, "KVM HA fence on storage heartbeat failure" (kvm.ha.fence.on.storage.heartbeat.failure), and it is already disabled; other than that, I have not found a way to avoid this issue.
-
If you do not want the host to be rebooted when writing the heartbeat fails, please add/change the value in agent.properties
and restart the cloudstack-agent service.
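For anyone scripting this across several hosts, here is a minimal Python sketch of setting a key in a Java-style properties file. The helper name `set_property` is hypothetical. The key used in the demo, `reboot.host.and.alert.management.on.heartbeat.timeout`, is an assumption: it is not quoted in this thread, so please verify it against the agent.properties shipped with your ACS version before relying on it.

```python
import tempfile
from pathlib import Path


def set_property(path: str, key: str, value: str) -> None:
    """Set key=value in a properties file.

    Replaces an existing uncommented entry for the key, or appends a new
    line when the key is not present. Comment lines are left untouched.
    """
    file = Path(path)
    lines = file.read_text().splitlines() if file.exists() else []
    updated = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("#") or "=" not in stripped:
            continue
        if stripped.split("=", 1)[0].strip() == key:
            lines[i] = f"{key}={value}"
            updated = True
    if not updated:
        lines.append(f"{key}={value}")
    file.write_text("\n".join(lines) + "\n")


if __name__ == "__main__":
    # Demo against a throwaway file, not the real agent configuration.
    with tempfile.TemporaryDirectory() as tmp:
        demo = Path(tmp) / "agent.properties"
        demo.write_text("# demo file\n")
        # ASSUMPTION: this key is not quoted in the thread; verify it first.
        set_property(str(demo),
                     "reboot.host.and.alert.management.on.heartbeat.timeout",
                     "false")
        print(demo.read_text())
```

On a real host the file is usually /etc/cloudstack/agent/agent.properties (also an assumption about your install), and the agent has to be restarted afterwards, as noted above.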
-
Thanks, will check and validate this in our environment.
-
I hit the same issue a few weeks ago and can confirm the agent.properties change certainly helped.
-
problem
Multiple hosts got rebooted while adding NFS primary storage.
ACS version: 4.19.1.2
KVM: RHEL 8.10
NFS: v3 with nolock option
Below is the error from one of the hosts.
message log:
java[4095]: WARN [kvm.storage.LibvirtStoragePool] (Thread-1:) (logid:) Process [2479886] for command [/usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/kvmheartbeat.sh -i -p /virt/NFS path -m /mnt/a288c84e-d100-334b-9bc3-0d79ffe9a610 -h ] encountered the error: [Failed to create /mnt/a288c84e-d100-334b-9bc3-0d79ffe9a610/KVMHA//].
agent log:
WARN [kvm.resource.KVMHAMonitor] (Thread-1:null) (logid:) Write heartbeat for pool [a288c84e-d100-334b-9bc3-0d79ffe9a610] failed: Failed to create /mnt/a288c84e-d100-334b-9bc3-0d79ffe9a610/KVMHA//hb-; try: 5 of 5.
versions
The versions of ACS, hypervisors, storage, network etc..
The steps to reproduce the bug
...
What to do about it?
No response