RE: [PATCH 1/1] Documentation: hyperv: Add overview of guest VM hibernation
From: Michael Kelley
Date: Fri Dec 13 2024 - 15:43:54 EST

From: Roman Kisel <romank@xxxxxxxxxxxxxxxxxxx> Sent: Friday, December 13, 2024 10:44 AM
>
> On 12/12/2024 3:17 PM, mhkelley58@xxxxxxxxx wrote:
> > From: Michael Kelley <mhklinux@xxxxxxxxxxx>
> >
> > Add documentation on how hibernation works in a guest VM on Hyper-V.
> > Describe how VMBus devices and the VMBus itself are hibernated and
> > resumed, along with various limitations.
> >
[snip]
> > +Considerations for Guest VM Hibernation
> > +---------------------------------------
> > +Linux guests on Hyper-V can also be hibernated, in which case the
> > +hardware is the virtual hardware provided by Hyper-V to the guest VM.
> > +Only the targeted guest VM is hibernated, while other guest VMs and
> > +the underlying Hyper-V host continue to run normally. While the
> > +underlying Windows Hyper-V and physical hardware on which it is
> > +running might also be hibernated using hibernation functionality in
> > +the Windows host, host hibernation and its impact on guest VMs is not
> > +in scope for this documentation.
> > +
> > +Resuming a hibernated guest VM can be more challenging than with
> > +physical hardware because VMs make it very easy to change the hardware
> > +configuration between the hibernation and resume. Even when the resume
> > +is done on the same VM that hibernated, the memory size might be
> > +changed, or virtual NICs or SCSI controllers might be added or
> > +removed. Virtual PCI devices assigned to the VM might be added or
> > +removed. Most such changes cause the resume steps to fail, though
> > +adding a new virtual NIC, SCSI controller, or vPCI device should work.
> > +
>
> Would it be useful mentioning the (likely lethal for the VM) risk
> of copying the hibernated VM to another host (of the same arch) that has
> another set of CPUID bits/features?

Yes, that's a good point that is specific to VMs. I'll add it to the
documentation.

[snip]
>
> Appreciated documenting all the intricacies of the hibernation and
> resume paths for various devices, an incredible read! Are there
> any special considerations known to you for the hibernation of
> the devices driven through the Hyper-V UIO?
>
The UIO driver for VMBus devices (uio_hv_generic.c) does not support
hibernation -- unlike the other VMBus device drivers, it does not
implement "suspend" and "resume" functions. Consequently,
vmbus_suspend() returns -EOPNOTSUPP (-95) when Linux goes through the
hibernation sequence. The error causes the sequence to abort, and the
VM is not hibernated.
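
As a rough sketch of that failure path, here is a small user-space
model in Python. It is not the kernel code. The names Driver,
model_vmbus_suspend, freeze_devices, and "example_drv" are hypothetical,
and the model assumes only what is described above: a driver with no
suspend function yields -EOPNOTSUPP, and the first error aborts the
sequence.

```python
import errno
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Driver:
    """Hypothetical stand-in for a VMBus device driver."""
    name: str
    # None models a driver that does not implement a suspend function.
    suspend: Optional[Callable[[], int]] = None


def model_vmbus_suspend(drv: Driver) -> int:
    """Return 0 on success or a negative errno value."""
    if drv.suspend is None:
        # No suspend function: "operation not supported" (95 on Linux).
        return -errno.EOPNOTSUPP
    return drv.suspend()


def freeze_devices(drivers: List[Driver]) -> Tuple[bool, Optional[str], int]:
    """Suspend each driver in turn and stop at the first error.

    Returns (ok, name_of_failing_driver, error).
    """
    for drv in drivers:
        err = model_vmbus_suspend(drv)
        if err != 0:
            # Any failure aborts the whole (modelled) hibernation.
            return (False, drv.name, err)
    return (True, None, 0)


if __name__ == "__main__":
    drivers = [
        Driver("example_drv", suspend=lambda: 0),  # implements suspend
        Driver("uio_hv_generic"),                  # no suspend function
    ]
    print(freeze_devices(drivers))
```

In this model, one device bound to a driver without a suspend function
is enough to stop hibernation for the whole VM.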

FWIW, here's example output:

[86945.335293] PM: hibernation: hibernation entry
[86945.344403] Filesystems sync: 0.008 seconds
[86945.344853] Freezing user space processes
[86945.346331] Freezing user space processes completed (elapsed 0.001 seconds)
[86945.346340] OOM killer disabled.
[86945.346410] PM: hibernation: Marking nosave pages: [mem 0x00000000-0x00000fff]
[86945.346412] PM: hibernation: Marking nosave pages: [mem 0x000a0000-0x000fffff]
[86945.346413] PM: hibernation: Marking nosave pages: [mem 0xee9b4000-0xee9bafff]
[86945.346414] PM: hibernation: Marking nosave pages: [mem 0xeff41000-0xeffc4fff]
[86945.346415] PM: hibernation: Marking nosave pages: [mem 0xeffd3000-0xefffefff]
[86945.346416] PM: hibernation: Marking nosave pages: [mem 0xf0000000-0xffffffff]
[86945.346649] PM: hibernation: Basic memory bitmaps created
[86945.346659] PM: hibernation: Preallocating image memory
[86946.173088] PM: hibernation: Allocated 611162 pages for snapshot
[86946.173105] PM: hibernation: Allocated 2444648 kbytes in 0.82 seconds (2981.27 MB/s)
[86946.173114] Freezing remaining freezable tasks
[86946.174472] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[86946.174568] printk: Suspending console(s) (use no_console_suspend to debug)
[86946.205965] uio_hv_generic 1fe3f3ee-76ed-4bfe-871f-984d1563c7a2: PM: dpm_run_callback(): vmbus_suspend [hv_vmbus] returns -95
[86946.205989] uio_hv_generic 1fe3f3ee-76ed-4bfe-871f-984d1563c7a2: PM: failed to freeze noirq: error -95
[86946.206812] PM: hibernation: Some devices failed to power down, aborting
[86946.246700] PM: hibernation: Basic memory bitmaps freed
[86946.247759] OOM killer enabled.
[86946.247766] Restarting tasks ... done.
[86946.249415] PM: hibernation: hibernation exit
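
To find which device blocked hibernation in output like the above,
something along these lines could scan the log for dpm_run_callback()
lines that report a non-zero return value. This is only a sketch: the
regular expression is modelled on the single log line shown above and
is an assumption about the format, which may differ on other kernel
versions. CallbackFailure and find_callback_failures are hypothetical
names.

```python
import re
from typing import List, NamedTuple


class CallbackFailure(NamedTuple):
    driver: str
    device: str
    callback: str
    error: int


# Assumed line shape, taken from the example log:
# [ts] <driver> <device>: PM: dpm_run_callback(): <fn> [<module>] returns <err>
_PATTERN = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(?P<driver>\S+)\s+(?P<device>\S+): "
    r"PM: dpm_run_callback\(\): (?P<callback>\S+)(?: \[[^\]]+\])? "
    r"returns (?P<error>-?\d+)\s*$"
)


def find_callback_failures(log_text: str) -> List[CallbackFailure]:
    """Return one entry per dpm_run_callback() line with a non-zero result."""
    failures = []
    for line in log_text.splitlines():
        m = _PATTERN.match(line)
        if m is None:
            continue
        error = int(m.group("error"))
        if error != 0:
            failures.append(
                CallbackFailure(
                    m.group("driver"),
                    m.group("device"),
                    m.group("callback"),
                    error,
                )
            )
    return failures


if __name__ == "__main__":
    sample = (
        "[86946.205965] uio_hv_generic 1fe3f3ee-76ed-4bfe-871f-984d1563c7a2: "
        "PM: dpm_run_callback(): vmbus_suspend [hv_vmbus] returns -95\n"
        "[86946.206812] PM: hibernation: Some devices failed to power down, aborting\n"
    )
    for failure in find_callback_failures(sample):
        print(failure)
```

Saved dmesg output can be passed to find_callback_failures() as one
string. Lines that do not match the assumed shape are skipped.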

I'll add this limitation to the documentation as well. Given the plans
to use the UIO driver for a broader set of specialty VMBus devices,
this is a limitation that likely needs to be remedied.

Thanks for the input! These are useful points that I had not
considered.

Michael