Can't restore VM in Proxmox #62

Open
paprikkafox opened this issue Jun 5, 2023 · 13 comments

@paprikkafox

Environment:

  • 3 hypervisor nodes with LINSTOR installed in HA mode (drbd-reactor)
  • Each has 3 disks in pool 'ssd_zpool1' (2 servers with 3 SSDs and 1 server with 2 SSDs + 1 HDD)
  • 2 servers have 'hdd_zpool1' with 8 TB HDDs and 1 server has the same pool with 2 TB HDDs
  • Each pool is based on ZFS thin-provisioned volumes
╭──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ StoragePool          ┊ Node       ┊ Driver   ┊ PoolName   ┊ FreeCapacity ┊ TotalCapacity ┊ CanSnapshots ┊ State ┊ SharedName ┊
╞══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ DfltDisklessStorPool ┊ SRVDMPVE01 ┊ DISKLESS ┊            ┊              ┊               ┊ False        ┊ Ok    ┊            ┊
┊ DfltDisklessStorPool ┊ SRVDMPVE02 ┊ DISKLESS ┊            ┊              ┊               ┊ False        ┊ Ok    ┊            ┊
┊ DfltDisklessStorPool ┊ SRVDMPVE03 ┊ DISKLESS ┊            ┊              ┊               ┊ False        ┊ Ok    ┊            ┊
┊ hdd_zpool1           ┊ SRVDMPVE01 ┊ ZFS      ┊ hdd_zpool1 ┊     6.11 TiB ┊      7.27 TiB ┊ True         ┊ Ok    ┊            ┊
┊ hdd_zpool1           ┊ SRVDMPVE02 ┊ ZFS      ┊ hdd_zpool1 ┊     1.76 TiB ┊      1.81 TiB ┊ True         ┊ Ok    ┊            ┊
┊ hdd_zpool1           ┊ SRVDMPVE03 ┊ ZFS      ┊ hdd_zpool1 ┊     6.11 TiB ┊      7.27 TiB ┊ True         ┊ Ok    ┊            ┊
┊ ssd_zpool1           ┊ SRVDMPVE01 ┊ ZFS      ┊ ssd_zpool1 ┊     1.75 TiB ┊      2.72 TiB ┊ True         ┊ Ok    ┊            ┊
┊ ssd_zpool1           ┊ SRVDMPVE02 ┊ ZFS      ┊ ssd_zpool1 ┊     1.74 TiB ┊      2.72 TiB ┊ True         ┊ Ok    ┊            ┊
┊ ssd_zpool1           ┊ SRVDMPVE03 ┊ ZFS      ┊ ssd_zpool1 ┊     1.74 TiB ┊      2.72 TiB ┊ True         ┊ Ok    ┊            ┊
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

Software Versions:

  • Proxmox Virtual Environment 7.4-3
  • Linstor stack - 1.18.0; GIT-hash: 9a2f939169b360ed3daa3fa2623dc3baa22cb509

Proxmox Plugin config:

drbd: net-vz-data
        resourcegroup hot_data
        content rootdir,images
        controller MULTIPLE_IPS_OF_CONTROLLERS
        preferlocal true
        statuscache 5

Problem:

When I create and then restore a backup of a VM with TPM 2.0 and EFI storage enabled, I get an error about mismatched disk sizes for the EFI disk (used to store EFI vars):

vma: vma_reader_register_bs for stream drive-efidisk0 failed - unexpected size 5242880 != 540672

Full restore log:

restore vma archive: zstd -q -d -c /mnt/pve/net-share-01/dump/vzdump-qemu-101-2023_06_05-11_08_41.vma.zst | vma extract -v -r /var/tmp/vzdumptmp1491351.fifo - /var/tmp/vzdumptmp1491351
CFG: size: 781 name: qemu-server.conf
DEV: dev_id=1 size: 540672 devname: drive-efidisk0
DEV: dev_id=2 size: 8589934592 devname: drive-scsi0
DEV: dev_id=3 size: 5242880 devname: drive-tpmstate0-backup
CTIME: Mon Jun  5 11:08:43 2023
new volume ID is 'net-vz-data:vm-101-disk-1'
new volume ID is 'net-vz-data:vm-101-disk-2'
new volume ID is 'net-vz-data:vm-101-disk-3'
map 'drive-efidisk0' to '/dev/drbd/by-res/vm-101-disk-1/0' (write zeros = 1)
map 'drive-scsi0' to '/dev/drbd/by-res/vm-101-disk-2/0' (write zeros = 1)
map 'drive-tpmstate0-backup' to '/dev/drbd/by-res/vm-101-disk-3/0' (write zeros = 1)
vma: vma_reader_register_bs for stream drive-efidisk0 failed - unexpected size 5242880 != 540672
/bin/bash: line 1: 1491353 Broken pipe             zstd -q -d -c /mnt/pve/net-share-01/dump/vzdump-qemu-101-2023_06_05-11_08_41.vma.zst
     1491354 Trace/breakpoint trap   | vma extract -v -r /var/tmp/vzdumptmp1491351.fifo - /var/tmp/vzdumptmp1491351
temporary volume 'net-vz-data:vm-101-disk-2' sucessfuly removed
temporary volume 'net-vz-data:vm-101-disk-1' sucessfuly removed
temporary volume 'net-vz-data:vm-101-disk-3' sucessfuly removed
no lock found trying to remove 'create'  lock
error before or during data restore, some or all disks were not completely restored. VM 101 state is NOT cleaned up.
TASK ERROR: command 'set -o pipefail && zstd -q -d -c /mnt/pve/net-share-01/dump/vzdump-qemu-101-2023_06_05-11_08_41.vma.zst | vma extract -v -r /var/tmp/vzdumptmp1491351.fifo - /var/tmp/vzdumptmp1491351' failed: exit code 133

I think the problem is somehow tied to ZFS thin provisioning and the related functionality in LINSTOR. Please tell me if I'm doing something wrong.
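For reference, one way to see the mismatch directly is to spawn a throwaway resource of the same nominal size as the EFI disk (528 KiB) and inspect what LINSTOR/DRBD actually provide. This is only a sketch: the resource name efitest is a placeholder, the resource group hot_data is taken from the plugin config above, and the <resource>_00000 dataset naming on the ZFS pool is an assumption.

linstor resource-group spawn-resources hot_data efitest 528K
blockdev --getsize64 /dev/drbd/by-res/efitest/0            # on the affected setup this reports roughly 5 MiB instead of 540672
zfs get -p volsize,volblocksize ssd_zpool1/efitest_00000   # backing zvol; dataset name is an assumption
linstor resource-definition delete efitest                 # clean up the test resource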

@ggzengel

ggzengel commented Aug 8, 2023

Try to put metadata on a separate block device:
LINBIT/linstor-server#128

I use something like this:

linstor controller set-property DrbdOptions/auto-quorum disabled
linstor storage-pool create zfs px1 zfs_12 zpool1/proxmox/drbd
linstor storage-pool create zfs px2 zfs_12 zpool1/proxmox/drbd
linstor storage-pool create diskless px3 zfs_12
linstor resource-group create --storage-pool=zfs_12 --place-count=2 zfs_12
linstor volume-group create zfs_12

linstor sp c lvm px1 zfs_12_meta VG1
linstor sp c lvm px2 zfs_12_meta VG1
linstor sp sp px1 zfs_12_meta StorDriver/LvcreateOptions "-m 1 /dev/disk/by-partlabel/LVM_NVME01 /dev/disk/by-partlabel/LVM_NVME02" 
linstor sp sp px2 zfs_12_meta StorDriver/LvcreateOptions "-m 1 /dev/disk/by-partlabel/LVM_NVME01 /dev/disk/by-partlabel/LVM_NVME02" 
linstor rg sp zfs_12 StorPoolNameDrbdMeta zfs_12_meta
linstor rg sp zfs_12 DrbdMetaType external
linstor rg sp zfs_12 StorDriver/ZfscreateOptions "-o volblocksize=16k" 
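(For context, putting the DRBD metadata on a separate LVM pool keeps it out of the ZFS zvol, which presumably keeps the data volume's size closer to what was requested. A quick way to confirm the properties were applied, assuming the group and pool names above:)

linstor resource-group list-properties zfs_12
linstor storage-pool list-properties px1 zfs_12_meta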

@paprikkafox
Author

Try to put metadata on a separate block device: LINBIT/linstor-server#128


I tried, but it doesn't help with either volblocksize=16k or 32k.
I should also mention that the ssd_zpool1 pool is a raidz1 pool (3 disks) with the default ashift=12, created from the Proxmox web GUI.
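For reference, the pool and zvol geometry mentioned here can be checked directly (a quick sketch):

zpool get ashift ssd_zpool1          # raidz1 pool created from the Proxmox GUI, default ashift=12
zfs get -r volblocksize ssd_zpool1   # volblocksize of the zvols LINSTOR created in this pool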

@rck
Member

rck commented Apr 22, 2024

I have some ideas about why this happened in the first place, but when I checked whether that was the case, I was not able to reproduce it. My best guess is that PVE is no longer as strict with sizes as long as things fit. I saw these messages at the end of the restore:

VM 103 (scsi0): size of disk 'ontwodistinct:pm-634e6c6c_103' updated from 1G to 1052408K
VM 103 (efidisk0): size of disk 'ontwodistinct:pm-6fbd0488_103' updated from 8152K to 5M
VM 103 (tpmstate0): size of disk 'ontwodistinct:pm-5bf812b9_103' updated from 4M to 8152K

As this issue is already pretty old and I was no longer able to reproduce it, I'm closing this. If this is still an issue with the latest LINSTOR and the latest linstor-proxmox plugin, feel free to reopen.

@rck rck closed this as completed Apr 22, 2024
@modcritical

modcritical commented May 20, 2024

I am getting the same issue. This is on my reference infrastructure build, so nothing exotic has been done with it; it's pretty vanilla. After reading the last comment I upgraded everything in the cluster to today's latest packages and tried again; same result.

Here are the versions of everything:

drbd-dkms                            9.2.9-1
drbd-reactor                         1.4.1-1
drbd-utils                           9.28.0-1

linstor-client                       1.22.1-1
linstor-common                       1.27.1-1
linstor-controller                   1.27.1-1
linstor-proxmox                      8.0.2-1
linstor-satellite                    1.27.1-1

proxmox-archive-keyring              3.0
proxmox-backup-client                3.2.2-1
proxmox-backup-file-restore          3.2.2-1
proxmox-backup-restore-image         0.6.1
proxmox-default-headers              1.0.1
proxmox-default-kernel               1.0.1
proxmox-headers-6.5                  6.5.13-5
proxmox-headers-6.5.13-5-pve         6.5.13-5
proxmox-kernel-6.5                   6.5.13-5
proxmox-kernel-6.5.13-5-pve-signed   6.5.13-5
proxmox-kernel-helper                8.1.0
proxmox-mail-forward                 0.2.3
proxmox-mini-journalreader           1.4.0
proxmox-offline-mirror-docs          0.6.6
proxmox-offline-mirror-helper        0.6.6
proxmox-termproxy                    1.0.1
proxmox-ve                           8.2.0
proxmox-websocket-tunnel             0.2.0-1
proxmox-widget-toolkit               4.2.3

@rck Are you restoring to a Linstor storage target when you get these "size ... updated" messages? I see those only when I restore (successfully) to an LVM target. A Linstor target always fails.

Here is the output of a failed restore:

restore vma archive: vma extract -v -r /var/tmp/vzdumptmp31039.fifo /var/lib/vz/dump/vzdump-qemu-104-2024_05_20-18_11_48.vma /var/tmp/vzdumptmp31039
CFG: size: 1180 name: qemu-server.conf
DEV: dev_id=1 size: 540672 devname: drive-efidisk0
DEV: dev_id=2 size: 21474844672 devname: drive-scsi0
CTIME: Mon May 20 18:11:49 2024

NOTICE
  Trying to create diskful resource (pm-46e8e863) on (pve2).
new volume ID is 'essd1-r2:pm-46e8e863_108'

NOTICE
  Trying to create diskful resource (pm-671cd0a6) on (pve2).
new volume ID is 'essd1-r2:pm-671cd0a6_108'
map 'drive-efidisk0' to '/dev/drbd/by-res/pm-46e8e863/0' (write zeros = 1)
map 'drive-scsi0' to '/dev/drbd/by-res/pm-671cd0a6/0' (write zeros = 1)
vma: vma_reader_register_bs for stream drive-efidisk0 failed - unexpected size 5251072 != 540672
temporary volume 'essd1-r2:pm-671cd0a6_108' sucessfuly removed
temporary volume 'essd1-r2:pm-46e8e863_108' sucessfuly removed
no lock found trying to remove 'create'  lock
error before or during data restore, some or all disks were not completely restored. VM 108 state is NOT cleaned up.
TASK ERROR: command 'set -o pipefail && vma extract -v -r /var/tmp/vzdumptmp31039.fifo /var/lib/vz/dump/vzdump-qemu-104-2024_05_20-18_11_48.vma /var/tmp/vzdumptmp31039' failed: got signal 5

And a successful restore of the same backup to an LVM target:

restore vma archive: vma extract -v -r /var/tmp/vzdumptmp35053.fifo /var/lib/vz/dump/vzdump-qemu-104-2024_05_20-18_11_48.vma /var/tmp/vzdumptmp35053
CFG: size: 1180 name: qemu-server.conf
DEV: dev_id=1 size: 540672 devname: drive-efidisk0
DEV: dev_id=2 size: 21474844672 devname: drive-scsi0
CTIME: Mon May 20 18:11:49 2024
  Rounding up size to full physical extent 4.00 MiB
  Logical volume "vm-109-disk-0" created.
new volume ID is 'local-lvm:vm-109-disk-0'
  Rounding up size to full physical extent 20.00 GiB
  Logical volume "vm-109-disk-1" created.
new volume ID is 'local-lvm:vm-109-disk-1'
map 'drive-efidisk0' to '/dev/pve/vm-109-disk-0' (write zeros = 0)
map 'drive-scsi0' to '/dev/pve/vm-109-disk-1' (write zeros = 0)
progress 1% (read 214761472 bytes, duration 0 sec)
progress 2% (read 429522944 bytes, duration 1 sec)
.......
progress 99% (read 21260664832 bytes, duration 5 sec)
progress 100% (read 21475360768 bytes, duration 5 sec)
total bytes read 21475491840, sparse bytes 15185952768 (70.7%)
space reduction due to 4K zero blocks 3.16%
rescan volumes...
VM 109 (scsi0): size of disk 'local-lvm:vm-109-disk-1' updated from 20971528K to 20484M
VM 109 (efidisk0): size of disk 'local-lvm:vm-109-disk-0' updated from 528K to 4M
TASK OK

@modcritical

modcritical commented May 21, 2024

Update - I can reproduce this issue when restoring from a backup stored on a node's local storage, but it works fine when restoring backups stored on Proxmox Backup Server to a Linstor target.

When restoring from PBS I do see the messages rck noted:

VM 108 (efidisk0): size of disk 'essd1-r2:pm-2a7cd496_108' updated from 528K to 5128K

@rck
Member

rck commented May 21, 2024

@arcandspark

Are you restoring to a Linstor storage target when you get these "size ... updated" messages?

Yes, in my case that was a DRBD/LINSTOR disk where the backing storage was LVM, backed up to local LVM, restored to DRBD/LINSTOR with LVM as backing disks.

What type of storage (pool) do you use for a) the VM (ZFS or LVM?) and b) the backup (LVM, from what I saw)? Is there some LVM vs. ZFS mismatch going on?

@modcritical

modcritical commented May 21, 2024

In my case it is a DRBD/LINSTOR disk backed by ZFS, backed up to a local directory, and restored to DRBD/LINSTOR with ZFS backing disks.

Given the storage config below, the VM is backed up from essd1-r2 to local, and the restore attempt is from local to essd1-r2.

root@pve1:~# zpool list
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
cssd1   928G  14.3G   914G        -         -     0%     1%  1.00x    ONLINE  -
essd1   744G  18.1G   726G        -         -     3%     2%  1.00x    ONLINE  -

root@pve1:~# zfs list
NAME                           USED  AVAIL  REFER  MOUNTPOINT
cssd1                         14.3G   885G    96K  /cssd1
cssd1/pm-2b85e5e6_00000       1.76G   885G  1.76G  -
cssd1/pm-516349e2_00000       12.5G   885G  12.5G  -
essd1                         18.1G   703G    24K  /essd1
essd1/pm-2bd8d383_00000       2.41G   703G  2.41G  -
essd1/pm-444ddb3a_00000       2.41G   703G  2.41G  -
essd1/pm-4cd625c1_00000       33.5K   703G  33.5K  -
essd1/pm-72c37a62_00000       2.38G   703G  2.38G  -
essd1/pm-7f2ec680_00000       3.02G   703G  3.02G  -
essd1/pm-91d8ee35_00000       2.40G   703G  2.40G  -
essd1/pm-a0a0ed58_00000       1.82G   703G  1.82G  -
essd1/pm-a75b98dc_00000       31.5K   703G  31.5K  -
essd1/pm-b9f1a565_00000         33K   703G    33K  -
essd1/pm-c29564a5_00000       33.5K   703G  33.5K  -
essd1/pm-c4666574_00000       32.5K   703G  32.5K  -
essd1/pm-c7f922a5_00000       33.5K   703G  33.5K  -
essd1/pm-e6b468ee_00000         30K   703G    30K  -
essd1/pm-f0b4a3b2_00000       3.58G   703G  3.58G  -
essd1/vm-100-cloudinit_00000    17K   703G    17K  -
essd1/vm-101-cloudinit_00000    17K   703G    17K  -
essd1/vm-103-cloudinit_00000    17K   703G    17K  -

root@pve1:~# cat /etc/pve/storage.cfg
dir: local
        path /var/lib/vz
        content backup,vztmpl,iso

lvmthin: local-lvm
        thinpool data
        vgname pve
        content images,rootdir

drbd: essd1-r2
        resourcegroup essd1-r2
        apica /etc/linstor/ssl/controller-api.pem
        apicrt /etc/linstor/ssl/client-cert.pem
        apikey /etc/linstor/ssl/client-key.pem
        content rootdir,images
        controller pve1,pve2,pve3

drbd: cssd1-r2
        resourcegroup cssd1-r2
        apica /etc/linstor/ssl/controller-api.pem
        apicrt /etc/linstor/ssl/client-cert.pem
        apikey /etc/linstor/ssl/client-key.pem
        content images,rootdir
        controller pve1,pve2,pve3

pbs: backup1
        datastore backup1
        server pbs.dev
        content backup
        fingerprint EC:7C:89:DC:2D:B1:02:D2:93:64:12:DB:F3:7B:DA:90:17:BD:9A:37:47:5B:22:B2:CE:2A:B2:50:89:68:2A:8E
        username root@pam!pve

root@pve1:~# linstor sp list
╭─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
┊ StoragePool          ┊ Node ┊ Driver   ┊ PoolName ┊ FreeCapacity ┊ TotalCapacity ┊ CanSnapshots ┊ State ┊ SharedName                ┊
╞═════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
┊ DfltDisklessStorPool ┊ pve1 ┊ DISKLESS ┊          ┊              ┊               ┊ False        ┊ Ok    ┊ pve1;DfltDisklessStorPool ┊
┊ DfltDisklessStorPool ┊ pve2 ┊ DISKLESS ┊          ┊              ┊               ┊ False        ┊ Ok    ┊ pve2;DfltDisklessStorPool ┊
┊ DfltDisklessStorPool ┊ pve3 ┊ DISKLESS ┊          ┊              ┊               ┊ False        ┊ Ok    ┊ pve3;DfltDisklessStorPool ┊
┊ cssd1                ┊ pve1 ┊ ZFS_THIN ┊ cssd1    ┊   885.00 GiB ┊       928 GiB ┊ True         ┊ Ok    ┊ pve1;cssd1                ┊
┊ cssd1                ┊ pve2 ┊ ZFS_THIN ┊ cssd1    ┊   897.49 GiB ┊       928 GiB ┊ True         ┊ Ok    ┊ pve2;cssd1                ┊
┊ cssd1                ┊ pve3 ┊ ZFS_THIN ┊ cssd1    ┊   886.75 GiB ┊       928 GiB ┊ True         ┊ Ok    ┊ pve3;cssd1                ┊
┊ data                 ┊ pve1 ┊ LVM_THIN ┊ pve/data ┊   611.67 GiB ┊    611.73 GiB ┊ True         ┊ Ok    ┊ pve1;data                 ┊
┊ data                 ┊ pve2 ┊ LVM_THIN ┊ pve/data ┊   141.17 GiB ┊    141.23 GiB ┊ True         ┊ Ok    ┊ pve2;data                 ┊
┊ data                 ┊ pve3 ┊ LVM_THIN ┊ pve/data ┊   141.23 GiB ┊    141.23 GiB ┊ True         ┊ Ok    ┊ pve3;data                 ┊
┊ essd1                ┊ pve1 ┊ ZFS_THIN ┊ essd1    ┊   702.78 GiB ┊       744 GiB ┊ True         ┊ Ok    ┊ pve1;essd1                ┊
┊ essd1                ┊ pve2 ┊ ZFS_THIN ┊ essd1    ┊   710.32 GiB ┊       744 GiB ┊ True         ┊ Ok    ┊ pve2;essd1                ┊
┊ essd1                ┊ pve3 ┊ ZFS_THIN ┊ essd1    ┊   710.36 GiB ┊       744 GiB ┊ True         ┊ Ok    ┊ pve3;essd1                ┊
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

@rck
Member

rck commented Jun 3, 2024

Thank you for the very detailed and helpful logs, and sorry this response took a bit longer... I have an idea and will try to reproduce it in my dev env.

@rck rck reopened this Jun 3, 2024
@rck
Member

rck commented Jun 6, 2024

"unfortunately" I still can not reproduce this. First I thought it might be a (block)size issue between zfs and lvm, but backup+restore worked as expected. then I thought it might be the EFI disk, but still:

progress 99% (read 2126053376 bytes, duration 15 sec)
progress 100% (read 2147483648 bytes, duration 15 sec)
total bytes read 2147549184, sparse bytes 2060103680 (95.9%)
space reduction due to 4K zero blocks 2.22%
rescan volumes...
VM 105 (efidisk0): size of disk 'tank:pm-906af5bd_105' updated from 5128K to 5M
TASK OK

@arcandspark can you reproduce it with a fresh dummy VM, no EFI disk, no "funny things" like snapshots or resizing? Create, backup, restore. Still failing?
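For reference, a minimal reproduction along those lines could look like the following (a sketch only; the VM IDs are placeholders, the storage names are taken from this thread, and the exact vzdump filename will differ):

qm create 9000 --name dummy --memory 512 --scsi0 essd1-r2:2                   # plain SeaBIOS VM, single small disk
vzdump 9000 --storage local --mode stop                                       # back up to the node's local dir storage
qmrestore /var/lib/vz/dump/vzdump-qemu-9000-*.vma* 9001 --storage essd1-r2    # restore back onto the LINSTOR storage
# the EFI variant tested below would add: --machine q35 --bios ovmf --efidisk0 essd1-r2:1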

@modcritical

I prepared two identically installed Debian 12 VMs, except one is a Q35/SeaBIOS VM and the other is a Q35/OVMF EFI VM. Each VM has a single disk on the LINSTOR pool (essd1-r2, backed by ZFS). The EFI VM also has its EFI disk on the LINSTOR pool (essd1-r2). I created a backup of each to the local directory storage of the same node, then restored each on that node from the backup.

The BIOS VM restore succeeded; the EFI VM restore failed when creating the EFI disk:

BIOS VM Restore Task:

restore vma archive: vma extract -v -r /var/tmp/vzdumptmp1726470.fifo /var/lib/vz/dump/vzdump-qemu-106-2024_06_06-14_02_16.vma /var/tmp/vzdumptmp1726470
CFG: size: 524 name: qemu-server.conf
DEV: dev_id=1 size: 21474844672 devname: drive-scsi0
CTIME: Thu Jun  6 14:02:21 2024

NOTICE
  Trying to create diskful resource (pm-fa737626) on (pve1).
new volume ID is 'essd1-r2:pm-fa737626_106'
map 'drive-scsi0' to '/dev/drbd/by-res/pm-fa737626/0' (write zeros = 1)
progress 1% (read 214761472 bytes, duration 1 sec)
... ... ...
progress 100% (read 21474836480 bytes, duration 104 sec)
total bytes read 21474902016, sparse bytes 17488576512 (81.4%)
space reduction due to 4K zero blocks 3.94%
rescan volumes...
VM 106 (scsi0): size of disk 'essd1-r2:pm-fa737626_106' updated from 20G to 20971528K
TASK OK

EFI VM Restore Task:

restore vma archive: vma extract -v -r /var/tmp/vzdumptmp1729754.fifo /var/lib/vz/dump/vzdump-qemu-108-2024_06_11-14_16_32.vma /var/tmp/vzdumptmp1729754
CFG: size: 625 name: qemu-server.conf
DEV: dev_id=1 size: 540672 devname: drive-efidisk0
DEV: dev_id=2 size: 21474844672 devname: drive-scsi0
CTIME: Tue Jun 11 14:16:35 2024

NOTICE
  Trying to create diskful resource (pm-e5762572) on (pve1).
new volume ID is 'essd1-r2:pm-e5762572_109'

NOTICE
  Trying to create diskful resource (pm-2aaf4174) on (pve1).
new volume ID is 'essd1-r2:pm-2aaf4174_109'
map 'drive-efidisk0' to '/dev/drbd/by-res/pm-e5762572/0' (write zeros = 1)
map 'drive-scsi0' to '/dev/drbd/by-res/pm-2aaf4174/0' (write zeros = 1)
vma: vma_reader_register_bs for stream drive-efidisk0 failed - unexpected size 5251072 != 540672
temporary volume 'essd1-r2:pm-e5762572_109' sucessfuly removed
temporary volume 'essd1-r2:pm-2aaf4174_109' sucessfuly removed
no lock found trying to remove 'create'  lock
error before or during data restore, some or all disks were not completely restored. VM 109 state is NOT cleaned up.
TASK ERROR: command 'set -o pipefail && vma extract -v -r /var/tmp/vzdumptmp1729754.fifo /var/lib/vz/dump/vzdump-qemu-108-2024_06_11-14_16_32.vma /var/tmp/vzdumptmp1729754' failed: got signal 5

@rck
Member

rck commented Jun 28, 2024

Once more, thanks for the detailed info. I always tested with Alpine images, and I did check the EFI disk box... whatever did the trick, Debian or Q35, I can now reproduce it. The problem is:

DEV: dev_id=1 size: 540672 devname: drive-efidisk0

These are bytes, so that makes 528K.

vma: vma_reader_register_bs for stream drive-efidisk0 failed - unexpected size 5251072 != 540672

5251072 bytes is 5128K, which is the 5M minimum plus a bit of rounding. DRBD devices have a lower size limit, and 5M looked like a good lower limit to me. Then add the usual rounding from LINSTOR, different block sizes, and obscure vma behavior, and you are there. It looks like this strict size check only triggers at certain disk sizes: if I use a lower minimum size (3M instead of 5M), things work. I will have to think about a proper fix, but the problem is now fully understood, thanks.
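For reference, the arithmetic behind the sizes quoted in this thread (a quick sketch):

echo $((540672 / 1024))     # 528  KiB: the EFI vars disk inside the backup
echo $((5242880 / 1024))    # 5120 KiB (5 MiB): the plugin's minimum volume size, as seen in the first report
echo $((5251072 / 1024))    # 5128 KiB: the same 5 MiB minimum plus LINSTOR/ZFS rounding, as seen in the later logs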

@rck
Member

rck commented Jul 1, 2024

This should be fixed in 2dfcc49. @arcandspark can you confirm this fixes the issue for you? Just replace the file/the line and maybe systemctl restart pvedaemon.
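For anyone who wants to try the change before a packaged release, something like the following should work; the plugin path below is the usual linstor-proxmox install location and is an assumption here, so verify it on your system first:

cp /usr/share/perl5/PVE/Storage/Custom/LINSTORPlugin.pm /root/LINSTORPlugin.pm.bak   # back up the installed plugin
# apply the one-line change from commit 2dfcc49 to the same file by hand, then:
systemctl restart pvedaemon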

@rck rck transferred this issue from LINBIT/linstor-server Jul 1, 2024
@rck
Member

rck commented Jul 9, 2024

@arcandspark did you have a chance to test the proposed fix?
