Re: [PATCH v5 1/1] Create debugfs file with virtio balloon usage information

From: David Hildenbrand
Date: Mon Jul 25 2022 - 07:37:05 EST


On 25.07.22 13:27, Alexander Atanasov wrote:
> Hi,
>
> On 18/07/2022 14:35, David Hildenbrand wrote:
>> On 14.07.22 15:20, Alexander Atanasov wrote:
>>> Allow the guest to know how much it is ballooned by the host.
>>> It is useful when debugging out of memory conditions.
>>>
>>> When host gets back memory from the guest it is accounted
>>> as used memory in the guest but the guest have no way to know
>>> how much it is actually ballooned.
>>>
>>> Signed-off-by: Alexander Atanasov <alexander.atanasov@xxxxxxxxxxxxx>
>>> ---
>>> drivers/virtio/virtio_balloon.c | 79 +++++++++++++++++++++++++++++
>>> include/uapi/linux/virtio_balloon.h | 1 +
>>> 2 files changed, 80 insertions(+)
>>>
>>> V2:
>>> - fixed coding style
>>> - removed pretty print
>>> V3:
>>> - removed dublicate of features
>>> - comment about balooned_pages more clear
>>> - convert host pages to balloon pages
>>> V4:
>>> - added a define for BALLOON_PAGE_SIZE to make things clear
>>> V5:
>>> - Made the calculatons work properly for both ways of memory accounting
>>> with or without deflate_on_oom
>>> - dropped comment
>>>
> [....]
>>> + u32 num_pages, total_pages, current_pages;
>>> + struct sysinfo i;
>>> +
>>> + si_meminfo(&i);
>>> +
>>> + seq_printf(f, "%-22s: %d\n", "page_size", VIRTIO_BALLOON_PAGE_SIZE);
>> Why? Either export all in ordinary page size or in kB. No need to
>> over-complicate the interface with a different page size that users
>> don't actually care about.
>>
>> I'd just stick to /proc/meminfo and use kB.
>
> The point is to expose some internal balloon data. Balloon works with
> pages not with kB  and users can easily calculate kB.

Pages translate to KB. kB are easy to consume by humans instead of pages
with variable apge sizes

>
>>> +
>>> + seq_printf(f, "%-22s: %u\n", "ballooned_pages", num_pages);
>>> +
>>> + if (virtio_has_feature(b->vdev, VIRTIO_BALLOON_F_DEFLATE_ON_OOM)) {
>>> + total_pages = guest_to_balloon_pages(i.totalram);
>>> + current_pages = guest_to_balloon_pages(i.totalram) - num_pages;
>>> + } else {
>>> + total_pages = guest_to_balloon_pages(i.totalram) + num_pages;
>>> + current_pages = guest_to_balloon_pages(i.totalram);
>>> + }
>>> +
>>> + /* Total Memory for the guest from host */
>>> + seq_printf(f, "%-22s: %u\n", "total_pages", total_pages);
>>> +
>>> + /* Current memory for the guest */
>>> + seq_printf(f, "%-22s: %u\n", "current_pages", current_pages);
>> The think I detest about that interface (total/current) is that it's
>> simply not correct -- because i.totalram for example doesn't include
>> things like (similar to MemTotal in /proc/meminfo)
>>
>> a) crashkernel
>> b) early boot allocations (e.g., memmap)
>> c) any kind of possible memory (un)hotplug in the future
>>
>> I'd really suggest to just KIS and not mess with i.totalram.
>>
>> Exposing how much memory is inflated and some way to identify how that
>> memory is accounted in /proc/meminfo should be good enough.
>>
>> E.g., the output here could simply be
>>
>> "Inflated: 1024 kB"
>> "MemTotalReduced: 1024 kB"
>>
>> That would even allow in the future for flexibility when it comes to how
>> much/what was actually subtracted from MemTotal.
>
>
> The point of the debug interface is to expose some of the balloon driver
> internals to the guest.
>
> The users of this are user space processes that try to work according to
> the memory pressure and nested virtualization.
>
> I haven't seen one userspace software that expects total ram to change,
> it should be constant with one exception hotplug. But if you do  hotplug
> RAM you know that and you can restart what you need to restart.
>
> So instead of messing with totalram with adding or removing pages /it
> would even be optimization since now it is done page by page/ i suggest
> to just account the inflated memory as used as in the deflate_on_oom
> case now. It is confusing and inconsistent as it is now. How to  explain
> to a vps user why his RAM is constantly changing?
>
> And the file can go just to
>
> inflated: <pages>
>
> ballon_page_size:  4096
>
> or
>
> inflated: kB
>
> I prefer pages because on theory guest and host can different size and
> if at some point guest starts to make requests to the host for pages it
> must somehow know what the host/balloon page is. Since you have all the
> information at one place it is easy to calculate kB. But you can not
> calculate pages from only kB.

The host can only inflate in guest-page base sizes. How that translates
to host-page base sizes is absolutely irrelevant.

Who should care about pages? Absolutely irrelevant.

Guest pages: kB / getpagesize()
Logical balloon pages: kB / 4k
Host pages: ???

>
> Later on when hotplug comes it can add it's own data to the file so it
> can be used to see the amount that is added to the total ram.
>
> It can add
>
> hotplugged: <pages>
>
>
> What do you think?

Let's keep this interface simple, please.

--
Thanks,

David / dhildenb