Re: [RFC PATCH] mm, hotplug: get rid of auto_online_blocks

From: Michal Hocko
Date: Mon Mar 13 2017 - 09:19:44 EST

On Mon 13-03-17 13:54:59, Vitaly Kuznetsov wrote:
> Michal Hocko <mhocko@xxxxxxxxxx> writes:
> > On Mon 13-03-17 11:55:54, Igor Mammedov wrote:
> >> > >
> >> > > - suggested RFC is not acceptable from virt point of view
> >> > > as it regresses guests on top of x86 kvm/vmware which
> >> > > both use ACPI based memory hotplug.
> >> > >
> >> > > - udev/userspace solution doesn't work in practice as it's
> >> > > too slow and unreliable when system is under load which
> >> > > is quite common in virt usecase. That's why auto online
> >> > > has been introduced in the first place.
> >> >
> >> > Please try to be more specific why "too slow" is a problem. Also how
> >> > much slower are we talking about?
> >>
> >> In virt case on host with lots of VMs, userspace handler
> >> processing could be scheduled late enough to trigger a race
> >> between (guest memory going away/OOM handler) and memory
> >> coming online.
> >
> > Either you are mixing two things together or this doesn't really make
> > much sense. So is this a ballooning based on memory hotplug (aka active
> > memory hotadd initiated between guest and host automatically) or a guest
> > asking for additional memory by other means (pay more for memory etc.)?
> > Because if this is an administrative operation then I seriously question
> > this reasoning.
> I'm probably repeating myself but it seems this point was lost:
> This is not really a 'ballooning', it is just a pure memory
> hotplug. People may have any tools monitoring their VM memory usage and
> when a VM is running low on memory they may want to hotplug more memory
> to it.

What API do those guests use to ask for the memory? And who is actually
responsible for asking for that memory? Is it the kernel or userspace?

> With udev-style memory onlining they should be aware of page
> tables and other in-kernel structures which require allocation so they
> need to add memory slowly and gradually or they risk running into OOM
> (at least getting some processes killed and these processes may be
> important). With in-kernel memory hotplug everything happens
> synchronously and no 'slowly and gradually' algorithm is required in
> all tools which may trigger memory hotplug.

What prevents those APIs from being used reasonably, asking only for as
much memory as they can afford? I mean, the 1.5% of available memory
necessary for the hotplug is not all that much. Or, more precisely, what
prevents asking for this additional memory in a synchronous way?
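
For illustration, here is a rough sketch of where a figure of about 1.5%
can come from. It assumes 4 KiB pages and a 64-byte struct page (typical
for x86_64, but neither number is stated in this thread) and it ignores
page tables for the direct mapping, so treat it as an estimate only. The
function name is made up for this example.

```python
# Assumptions (not from this thread): 4 KiB pages, 64-byte struct page.
PAGE_SIZE = 4096
STRUCT_PAGE_SIZE = 64
GiB = 1 << 30
MiB = 1 << 20


def memmap_overhead_bytes(hotplug_bytes: int) -> int:
    """Estimate the memmap bytes needed to describe newly added memory."""
    pages = hotplug_bytes // PAGE_SIZE
    return pages * STRUCT_PAGE_SIZE


for size in (1 * GiB, 16 * GiB):
    overhead = memmap_overhead_bytes(size)
    percent = 100 * overhead / size
    print(f"{size // GiB} GiB added: ~{overhead // MiB} MiB memmap ({percent:.2f}%)")
```

Under these assumptions, a tool that wants to add N bytes needs roughly
N/64 bytes free beforehand, which it can check before asking for more.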

And just to prevent further confusion: I can see why _in-kernel_
hotplug based 'ballooning' (or whatever you call on-demand memory hotplug
based on memory pressure) wants to be synchronous, and that is why
my patch changed those to online the memory explicitly. I am questioning
whether memory hotplug requested by an admin/userspace component needs any
special assistance from the kernel, because it is only a shortcut which can
be implemented from userspace. I hope I've made myself clear.
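
To make the "implemented from userspace" part concrete, a minimal sketch
of the onlining step follows. It relies on the sysfs memory block
interface (/sys/devices/system/memory/memoryN/state); the function name
and the root parameter are made up for this example, and the error
handling is deliberately simplistic. It is not what any particular udev
rule or tool does.

```python
import glob
import os


def online_offline_blocks(root="/sys/devices/system/memory"):
    """Write 'online' to every memory block whose state reads 'offline'.

    Returns the list of block directories that were onlined.
    """
    onlined = []
    for state_path in sorted(glob.glob(os.path.join(root, "memory*", "state"))):
        with open(state_path) as f:
            state = f.read().strip()
        if state != "offline":
            continue
        try:
            with open(state_path, "w") as f:
                f.write("online")
        except OSError:
            # Onlining can fail (permissions, allocation failure); skip the block.
            continue
        onlined.append(os.path.dirname(state_path))
    return onlined
```

A management tool that triggers the hotplug could call something like
this itself right after the memory has been added, rather than waiting
for an asynchronous handler to get scheduled.
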
Michal Hocko