Re: [RFC PATCH] mm, hotplug: get rid of auto_online_blocks

From: Vitaly Kuznetsov
Date: Mon Mar 13 2017 - 09:43:16 EST


Michal Hocko <mhocko@xxxxxxxxxx> writes:

> On Mon 13-03-17 13:54:59, Vitaly Kuznetsov wrote:
>> Michal Hocko <mhocko@xxxxxxxxxx> writes:
>>
>> > On Mon 13-03-17 11:55:54, Igor Mammedov wrote:
>> >> > >
>> >> > > - suggested RFC is not acceptable from virt point of view
>> >> > > as it regresses guests on top of x86 kvm/vmware which
>> >> > > both use ACPI based memory hotplug.
>> >> > >
>> >> > > - udev/userspace solution doesn't work in practice as it's
>> >> > > too slow and unreliable when system is under load which
>> >> > > is quite common in virt usecase. That's why auto online
>> >> > > has been introduced in the first place.
>> >> >
>> >> > Please try to be more specific why "too slow" is a problem. Also how
>> >> > much slower are we talking about?
>> >>
>> >> In virt case on a host with lots of VMs, userspace handler
>> >> processing could be scheduled late enough to trigger a race
>> >> between (guest memory going away/OOM handler) and memory
>> >> coming online.
>> >
>> > Either you are mixing two things together or this doesn't really make
>> > much sense. So is this a ballooning based on memory hotplug (aka active
>> > memory hotadd initiated between guest and host automatically) or a guest
>> > asking for additional memory by other means (pay more for memory etc.)?
>> > Because if this is an administrative operation then I seriously question
>> > this reasoning.
>>
>> I'm probably repeating myself but it seems this point was lost:
>>
>> This is not really a 'ballooning', it is just a pure memory
>> hotplug. People may have any tools monitoring their VM memory usage and
>> when a VM is running low on memory they may want to hotplug more memory
>> to it.
>
> What is the API those guests ask for the memory? And who is actually
> responsible to ask for that memory? Is it a kernel or userspace
> solution?

It can be anything; it can even be a system administrator running
'free'. The Hyper-V driver sends the si_mem_available() and
vm_memory_committed() metrics to the host every second, and these can
later be queried by any tool (e.g. a PowerShell script).
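
As a rough illustration of what such a monitoring tool might look at
from inside the guest: as far as I know, the same two quantities show up
in /proc/meminfo as MemAvailable and Committed_AS. The sketch below
parses them; the function names and the sample values are made up for
the example and are not what the Hyper-V driver itself does.

```python
def parse_meminfo(text):
    """Return {field: value} for lines such as 'MemAvailable:  123 kB'."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if not sep or not parts:
            continue  # skip blank or malformed lines
        fields[name.strip()] = int(parts[0])
    return fields


def memory_pressure(text):
    """Return (MemAvailable, Committed_AS) in KiB from meminfo-style text."""
    info = parse_meminfo(text)
    return info["MemAvailable"], info["Committed_AS"]


# Hypothetical sample; on a live guest one would instead pass
# open("/proc/meminfo").read().
SAMPLE = """MemTotal:        8000000 kB
MemAvailable:    1200000 kB
Committed_AS:    5000000 kB
"""
```

A 'low memory -> add more memory' script would compare the first value
against some threshold of its own choosing.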

>
>> With udev-style memory onlining they should be aware of page
>> tables and other in-kernel structures which require allocation so they
>> need to add memory slowly and gradually or they risk running into OOM
>> (at least getting some processes killed and these processes may be
>> important). With in-kernel memory hotplug everything happens
>> synchronously and no 'slowly and gradually' algorithm is required in
>> all tools which may trigger memory hotplug.
>
> What prevents those APIs being used reasonably and only asks so much
> memory as they can afford? I mean 1.5% available memory necessary for
> the hotplug is not all that much. Or more precisely what prevents to ask
> for this additional memory in a synchronous way?
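
For context, a figure of about 1.5% is what one gets if the per-page
metadata (struct page) is assumed to be 64 bytes for each 4 KiB page.
Both sizes are assumptions for this sketch (typical for x86_64, not
stated in this thread), and page tables or other allocations are not
counted.

```python
PAGE_SIZE = 4096        # assumed base page size in bytes
STRUCT_PAGE_SIZE = 64   # assumed size of struct page in bytes


def memmap_overhead_bytes(added_bytes,
                          page_size=PAGE_SIZE,
                          struct_page_size=STRUCT_PAGE_SIZE):
    """Memory needed for page metadata before added_bytes can be used."""
    pages = added_bytes // page_size
    return pages * struct_page_size


def memmap_overhead_fraction(page_size=PAGE_SIZE,
                             struct_page_size=STRUCT_PAGE_SIZE):
    """Metadata overhead as a fraction of the memory being added."""
    return struct_page_size / page_size
```

Under these assumptions, hotplugging 1 GiB needs 16 MiB of already
available memory for the metadata alone.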

The fact that we need to add memory slowly and wait until it gets
onlined is not obvious. AFAIR when you hotplug memory to Windows VMs
there is no such thing as 'onlining', and no brain is required: a
simple script 'low memory -> add more memory' always works. Asking all
these script writers to think twice before issuing a memory add command
sounds like too much (to me).
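
To make the 'onlining' step concrete, this is roughly what a userspace
(udev-style) handler has to do after the hotplug event arrives: find
memory blocks whose sysfs 'state' file reads 'offline' and write
'online' to it. It is only a sketch; the function name is made up, the
'base' parameter exists so it can be tried against a scratch directory,
and a real handler would be triggered by udev rather than by scanning.

```python
import glob
import os


def online_offline_blocks(base="/sys/devices/system/memory"):
    """Online every memory block under base that is currently offline.

    Returns the names of the blocks that were successfully onlined.
    """
    onlined = []
    pattern = os.path.join(base, "memory*", "state")
    for state_path in sorted(glob.glob(pattern)):
        with open(state_path) as f:
            state = f.read().strip()
        if state != "offline":
            continue
        try:
            with open(state_path, "w") as f:
                f.write("online")
        except OSError:
            # Onlining can fail (e.g. no permission); leave it offline.
            continue
        onlined.append(os.path.basename(os.path.dirname(state_path)))
    return onlined
```

Until something like this has run, the hotplugged memory is not usable,
which is the window the auto online setting closes.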

--
Vitaly