Re: [RFC] memory pressure detection in VMs using PSI mechanism for dynamically inflating/deflating VM memory

From: David Hildenbrand
Date: Mon Jan 23 2023 - 05:00:29 EST



1. This will be a native userspace daemon running only in the Linux VM,
which will use the virtio-mem driver, which in turn uses memory hotplug
to add/remove memory. The VM (aka Secondary VM, SVM) will request
memory from the host (the Primary VM, PVM) via the backend hypervisor,
which takes care of cross-VM communication.

2. This will be guest driven. This daemon will use the PSI mechanism to
monitor memory pressure and keep track of memory demand in the system.
It will register for a few memory pressure events and make an educated
guess about when demand for memory in the system is increasing.
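
For readers unfamiliar with PSI triggers, the registration step described
above can be sketched roughly as follows. This is not the daemon under
discussion, only a minimal illustration of the documented kernel interface:
a trigger string of the form "<some|full> <stall_us> <window_us>" is written
to /proc/pressure/memory and the file descriptor is then polled for POLLPRI.
The helper names (psi_format_trigger, psi_register, psi_wait) and the
threshold values in the usage note are assumptions made for this example.

```c
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <unistd.h>

/* Build a PSI trigger string "<kind> <stall_us> <window_us>".
 * Returns the string length, or -1 if it does not fit into buf. */
int psi_format_trigger(char *buf, size_t len, const char *kind,
                       unsigned int stall_us, unsigned int window_us)
{
    int n = snprintf(buf, len, "%s %u %u", kind, stall_us, window_us);

    return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/* Register a trigger on a PSI file such as /proc/pressure/memory.
 * Returns a file descriptor to poll, or -1 on failure. */
int psi_register(const char *path, const char *kind,
                 unsigned int stall_us, unsigned int window_us)
{
    char trig[64];
    int n = psi_format_trigger(trig, sizeof(trig), kind, stall_us, window_us);
    int fd;

    if (n < 0)
        return -1;
    fd = open(path, O_RDWR | O_NONBLOCK);
    if (fd < 0)
        return -1;
    /* The trigger is written including its terminating NUL byte. */
    if (write(fd, trig, (size_t)n + 1) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Wait for one trigger event.
 * Returns 1 if the threshold was crossed, 0 on timeout, -1 on error. */
int psi_wait(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLPRI };
    int r = poll(&pfd, 1, timeout_ms);

    if (r <= 0)
        return r;
    if (pfd.revents & POLLERR)
        return -1;
    return (pfd.revents & POLLPRI) ? 1 : 0;
}
```

A daemon could call psi_register("/proc/pressure/memory", "some", 150000,
1000000) to be notified when tasks stall on memory for 150 ms within any
1 s window, and then loop on psi_wait() before asking for more memory. The
right thresholds depend on how early the notification has to arrive.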

Is that running in the primary or the secondary VM?

The userspace PSI daemon will be running on the secondary VM. It will talk
to a kernel driver (running on the secondary VM itself) via ioctl. This
kernel driver will talk to a slightly modified version of the virtio-mem
driver, where it can call the virtio_mem_config_changed(virtiomem_device)
function to resize the secondary VM. So it's mainly "guest driven" now.

Okay, thanks.

[...]


This daemon is currently only at the beta stage, and we have basic
functionality running. We have yet to add more flesh to this scheme to

Good to hear that the basics are running with virtio-mem (I assume :) ).

make sure any potential risks or security concerns are taken care of as
well.

It would be great to draw/explain the architecture in more detail.

We will be looking into solving any potential security concerns, where
the hypervisor would restrict certain memory resizing actions. Right now,
we are experimenting to see whether the PSI mechanism itself can be used
to detect memory pressure in the system and add memory to the secondary
VM when it is needed, taking into account all the latencies
involved in the PSI scheme (i.e., the time from a malloc call until
the extra memory gets added to the SVM). We also wanted to know
upstream's opinion on such a scheme that uses the PSI mechanism to detect
memory pressure and resize the SVM accordingly.

One problematic thing is that adding memory to Linux via virtio-mem itself consumes memory (e.g., for the memmap), especially when a completely new memory block has to be added to Linux.

So if you're already under severe memory pressure, the allocations needed to bring up new memory can fail. The question is whether PSI can notify "early" enough that this barely happens in practice.
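
To put a rough number on the memmap cost, here is a small back-of-the-envelope helper. The values used in the note below (4 KiB base pages, a 64-byte struct page, a 128 MiB memory block) are assumptions that are typical for x86-64 but depend on architecture and kernel configuration; the function name memmap_bytes is made up for this example. Other per-block costs, such as page tables for the direct map, are not counted.

```c
#include <stdint.h>

/* Bytes of memmap (the struct page array) needed to describe
 * block_bytes of newly added memory. page_bytes and struct_page_bytes
 * are parameters because both depend on architecture and config. */
uint64_t memmap_bytes(uint64_t block_bytes, uint64_t page_bytes,
                      uint64_t struct_page_bytes)
{
    uint64_t pages = block_bytes / page_bytes;

    return pages * struct_page_bytes;
}
```

Under these assumptions, memmap_bytes(128 MiB, 4096, 64) is 2 MiB, so onlining one new 128 MiB block first needs about 2 MiB of free memory in the guest, which is exactly the allocation that can fail under severe pressure.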

There are some possible ways to mitigate:

1) Always keep spare memory blocks added to Linux by virtio-mem that
don't expose any memory yet. Memory from these blocks can be handed
over to Linux without additional Linux allocations. Of course, they
consume metadata, so one might want to limit them.

2) Implement memmap_on_memory support for virtio-mem. This might help in
some setups, where the device block size is suitable.

Did you run into that scenario already during your experiments, and how did you deal with that?

--
Thanks,

David / dhildenb