Symptom

No common base snapshot on volume(s) <STORAGE_ID>:vm-<VMID>-disk-0

Cause: The target still holds an old copy of the disk (or its replication snapshots were pruned), so source and target no longer share a snapshot and the source cannot send incrementally.
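Before changing anything, you can confirm the mismatch by listing the snapshots of the replicated dataset on both nodes; an incremental send needs at least one snapshot name present on both sides. A minimal sketch, where <POOL> stands for the zpool name on each node:

# SOURCE node: snapshots of the replicated disk
zfs list -t snapshot -o name -r <POOL>/vm-<VMID>-disk-0
# TARGET node (run there directly, or via ssh from the source):
ssh <TARGET_NODE> 'zfs list -t snapshot -o name -r <POOL>/vm-<VMID>-disk-0'
# If no snapshot name shows up in BOTH outputs, there is no common base
# and the next run can only succeed as a full (re)send.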


A) Quick checklist (safe for running VMs)

  1. Confirm where the VM disk lives (source node)

 
qm config <VMID> | grep -E '^(scsi|virtio|sata|ide)[0-9]+:'
# expect a line like: <bus>: <STORAGE_ID>:vm-<VMID>-disk-0,...
  2. Verify storage mapping (both nodes)

 
grep -A8 -n '<STORAGE_ID>' /etc/pve/storage.cfg
# type should be zfspool, content should include 'images', and nodes should include both source & target
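For reference, a correctly shared zfspool entry in /etc/pve/storage.cfg looks roughly like this (pool and node names are placeholders; 'sparse 1' is optional, the important parts are the pool and the nodes line listing both nodes):

zfspool: <STORAGE_ID>
        pool <POOL>
        content images
        nodes <SOURCE_NODE>,<TARGET_NODE>
        sparse 1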
  3. Clean the target replica (on TARGET node)

 
# See if the dataset exists
zfs list -t all | grep vm-<VMID>-disk-0 || echo "no replica (ok)"
# Safer (keep a backup): rename it out of the way
zfs rename <POOL>/vm-<VMID>-disk-0 <POOL>/vm-<VMID>-disk-0-orphan-$(date +%s)
# Or clean up (delete it):
# zfs destroy -r <POOL>/vm-<VMID>-disk-0
  4. Delete the replication job (on SOURCE node)

 
qm unlock <VMID>
# Try the CLI first:
pvesr delete <VMID>-0 || true
# If the job is still listed in the GUI, edit the config by hand:
cp /etc/pve/replication.cfg /etc/pve/replication.cfg.bak
nano /etc/pve/replication.cfg
# remove the whole block for this job; it usually looks like:
#   local: <VMID>-0
#       target <TARGET_NODE>
#       rate ...
#       schedule ...

If the file won’t save: check cluster quorum with pvecm status (the cluster must be quorate for /etc/pve to be writable).
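A quick check (a sketch; output wording can vary slightly between versions):

pvecm status | grep -Ei 'quorate|expected votes|total votes'
# "Quorate: Yes" means /etc/pve is writable and the edit will be accepted.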

  5. (Optional) Clear stale job state

 
rm -f /var/lib/pve-manager/pvesr/<VMID>-0
  6. Recreate & seed a new base (SOURCE node, GUI)

  • VM → Replication → Add

    • Target node: <TARGET_NODE>

    • Target storage: <STORAGE_ID>

    • Set the schedule and an optional Rate limit (a limit is helpful during business hours)

  • Click Resync (or Run now) → a full initial send runs, then incrementals resume.
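If you prefer the command line, the job can also be recreated with pvesr; a minimal sketch, assuming a 15-minute schedule and a 50 MB/s rate limit (adjust both to your environment):

# SOURCE node: recreate the job, then ask the scheduler to run it soon
pvesr create-local-job <VMID>-0 <TARGET_NODE> --schedule '*/15' --rate 50
pvesr schedule-now <VMID>-0
pvesr status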


B) Verification & logs

 
pvesr status
journalctl -u pvescheduler -e                  # Proxmox 8.x replication scheduler
zpool status                                   # pool health
zfs list -t snapshot | grep vm-<VMID>-disk-0   # (source) replication snapshots will appear after the first run

C) Notes & gotchas

  • Does this stop the VM? No. The VM keeps running; the initial resync just adds I/O and network load (a rate limit can be adjusted on the fly; see the sketch after this list).

  • Two default gateways or mis-mapped storage IDs can cause hard-to-diagnose failures. Ensure the same storage ID is enabled on both nodes and points to an existing ZFS pool on each.

  • Don’t manually delete replication snapshots (those named like @__replicate_<VMID>-0_…__) on either side.

  • If the disk bus or backing storage changes (e.g., via move_disk), expect to have to recreate the replication job.
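Rate-limit sketch referenced above: an existing job’s bandwidth cap can be changed without recreating the job (values are examples; the rate is in MB/s):

# SOURCE node: cap the replication job during business hours
pvesr update <VMID>-0 --rate 30
# raise it again once the initial sync is through
pvesr update <VMID>-0 --rate 100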


D) Copy/paste template (replace ALL placeholders)

 
# VARS (edit these)
VMID=252105
STORAGE_ID=Storage-CLST1
TARGET_NODE=SRV250
TARGET_POOL=tank   # zpool name on TARGET

# 1) Source checks
qm unlock $VMID
qm config $VMID | grep -E '^(scsi|virtio|sata|ide)[0-9]+:'

# 2) Target clean-up
ssh $TARGET_NODE "zfs list -t all | grep vm-${VMID}-disk-0 || true"
ssh $TARGET_NODE "zfs destroy -r ${TARGET_POOL}/vm-${VMID}-disk-0 2>/dev/null || true"

# 3) Remove old job (source)
pvesr delete ${VMID}-0 2>/dev/null || true
cp /etc/pve/replication.cfg /etc/pve/replication.cfg.bak
# (manually edit /etc/pve/replication.cfg if needed and remove the '${VMID}-0' job block)
rm -f /var/lib/pve-manager/pvesr/${VMID}-0

# 4) Recreate via GUI and click Resync (full send)

E) Prevention tips

  • Keep replication retention ≥ your run interval so common base snapshots stick around.

  • Avoid deleting __replicate_* snapshots by hand.

  • After storage moves or pool renames, recreate the job.

  • Monitor Datacenter → Tasks and pvesr status regularly.
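A small cron-able check, as a sketch only: it assumes the last column of pvesr status is the job state and prints any job that is not OK (adapt to your monitoring stack):

#!/bin/sh
# flag replication jobs whose state is not OK
FAILED=$(pvesr status | awk 'NR > 1 && $NF != "OK"')
if [ -n "$FAILED" ]; then
    echo "Replication problem on $(hostname):"
    echo "$FAILED"
    exit 1
fi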

Thank you,
W3DATA Cloud Tech Support Team
