Re: [RFC PATCH] mm, hotplug: get rid of auto_online_blocks

From: Michal Hocko
Date: Mon Feb 27 2017 - 05:33:51 EST


On Mon 27-02-17 11:02:09, Vitaly Kuznetsov wrote:
[...]
> I don't have anything new to add to the discussion happened last week
> but I'd like to summarize my arguments against this change:
>
> 1) This patch doesn't solve any issue. Configuration option is not an
> issue by itself, it is an option for distros to decide what they want to
> ship: udev rule with known issues (legacy mode) or enable the new
> option. Distro makers and users building their kernels should be able to
> answer this simple question "do you want to automatically online all
> newly added memory or not".

OK, so could you be more specific? Distributions have no clue about
which HW their kernel runs on so how can they possibly make a sensible
decision here?

> There are distros already which ship kernels
> with CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE enabled (Fedora 24 and 25 as
> far as I remember, maybe someone else).
>
> 2) This patch creates an imbalance between Xen/Hyper-V on one side and
> KVM/Vmware on another. KVM/Vmware use pure ACPI memory hotplug and this
> memory won't get onlined. I don't understand how this problem is
> supposed to be solved by distros. They'll *have to* continue shipping
> a udev rule which has and always will have issues.

They have notifications for udev to online that memory and AFAICU
neither KVM nor VMware are using memory hotplut for ballooning - unlike
HyperV and Xen.

> 3) Kernel command line is not a viable choice, it is rather a debug
> method.

Why?

> Having all newly added memory online as soon as possible is a
> major use-case not something a couple of users wants (and this is
> proved by major distros shipping the unconditional 'offline->online'
> rule with udev).

I would argue because this really depends on the usecase. a) somebody
might want to online memory as movable and that really depends on which
node we are talking about because not all of them can be movable b) it
is easier to handle potential errors from userspace than the kernel.

> A couple of other thoughts:
> 1) Having all newly added memory online ASAP is probably what people
> want for all virtual machines. Unfortunately, we have additional
> complexity with memory zones (ZONE_NORMAL, ZONE_MOVABLE) and in some
> cases manual intervention is required. Especially, when further unplug
> is expected.

and that is why we do not want to hardwire this into the kernel and we
have a notification to handle this in userspace.

> 2) Adding new memory can (in some extreme cases) still fail as we need
> some *other* memory before we're able to online the newly added
> block. This is an issue to be solved and it is doable (IMO) with some
> pre-allocation.

you cannot preallocate for all the possible memory that can be added.
--
Michal Hocko
SUSE Labs