Re: [PATCH] virtio_balloon: stay awake while adjusting balloon

From: David Hildenbrand
Date: Mon Dec 18 2023 - 06:40:59 EST


On 14.12.23 05:13, David Stevens wrote:
On Wed, Dec 13, 2023 at 5:44 PM David Hildenbrand <david@xxxxxxxxxx> wrote:

On 11.12.23 12:43, David Stevens wrote:
From: David Stevens <stevensd@xxxxxxxxxxxx>


Hi David,

Add a wakeup event for when the balloon is inflating or deflating.
Userspace can enable this wakeup event to prevent the system from
suspending while the balloon is being adjusted. This allows
/sys/power/wakeup_count to be used without breaking virtio_balloon's
cooperative memory management.

Can you add/share some more details

I'm working on enabling support for Linux s2Idle in our Android
virtual machine, to restrict apps from running in the background
without holding an Android partial wakelock. With the patch I recently
sent out [1], since crosvm advertises native PCI power management for
virtio devices, the Android guest can properly enter s2idle, and it
can be woken up by incoming IO. However, one of the remaining problems
is that when the host needs to reclaim memory from the guest via the
virtio-balloon, there is nothing preventing the guest from entering
s2idle before the balloon driver finishes returning memory to the
host.

Thanks for the information. So you also want to wakeup the VM when wanting to get more memory from the VM?

Using which mechanism would that wakeup happen? Likely not the device itself?


One alternative to this approach would be to add a virtballoon_suspend
callback to abort suspend if the balloon is inflating/adjusting.
However, it seems cleaner to just prevent suspend in the first place.

Also, the PM notifier could also be used with very high priority, so the device would respond early to PM_SUSPEND_PREPARE.

[...]

queue_work(system_freezable_wq, work);
+ else
+ end_update_balloon_size(vb, seqno);

What if we stop the workqueue and unload the driver -- see
remove_common() -- won't you leave pm_stay_awake() wrongly set?

When a device gets removed, its wakeup source is destroyed, which
automatically calls __pm_relax.

Ah, thanks.


}

static int init_vqs(struct virtio_balloon *vb)
@@ -992,6 +1028,9 @@ static int virtballoon_probe(struct virtio_device *vdev)
goto out_unregister_oom;
}

+ spin_lock_init(&vb->adjustment_lock);
+ device_set_wakeup_capable(&vb->vdev->dev, true);


I'm a bit confused: Documentation/driver-api/pm/devices.rst documents

"
The :c:member:`power.can_wakeup` flag just records whether the device
(and its driver) can physically support wakeup events. The
:c:func:`device_set_wakeup_capable()` routine affects this flag.
"

...

"
Whether or not a device is capable of issuing wakeup events is a
hardware matter, and the kernel is responsible for keeping track of it.
"

But how is the virtio-balloon device capable of waking up the machine?
Your patch merely implies that the virtio-baloon device is capable to
prohbit going to sleep.

What am I missing?

The underlying virtio_pci_device is capable of waking up the machine,
if it supports PCI power management. The core PCI code will keep the
machine awake while processing the interrupt (i.e. during
virtballoon_changed), but after processing is handed off to the
virtio-balloon driver, the virtio-balloon driver needs to keep the
machine awake until the processing is actually completed.

An alternative to making vb->vdev->dev wakeup capable is to plumb the
pm_stay_awake/pm_relax calls to the underlying virtio_pci_device. Would
that be a preferable approach?

The way you describe it, it rather belongs into the PCI code, because that's what actually makes the device PM-capable (i.e., would not apply to virtio-ccw or virtio-mmio). The virtio-balloon itself is not PM-capable. But how hard is it to move that handling into PCI specific code?

--
Cheers,

David / dhildenb