Re: [PATCH 0/2] arm64/hotplug: Process MEM_OFFLINE and MEM_CANCEL_OFFLINE events

From: Michal Hocko
Date: Wed Apr 15 2020 - 06:44:34 EST


On Wed 15-04-20 09:35:33, David Hildenbrand wrote:
> On 15.04.20 08:39, Anshuman Khandual wrote:
> > This series improves arm64 memory event notifier (hot remove) robustness by
> > enabling it to detect potential problems (if any) in the future. But first
> > it enumerates memory isolation failure reasons that can be sent across a
> > notifier. This series does not go beyond arm64 to explore if these failure
> > reason codes could be used in other existing registered memory notifiers.
> > Please do let me know if there is any other potential use cases, will be
> > happy to incorporate next time around. Also should we add similar failure
> > reasons for online_pages() as well ?
> >
> > This series has been tested on arm64, boot tested on x86 and build tested
> > on multiple other platforms.
> >
>
> I'm sorry, but I have to nack this series. Why?
>
> 1. A hotplug notifier should not have to bother why offlining failed. He
> received a MEM_GOING_OFFLINE, followed by a MEM_CANCEL_OFFLINE. That's
> all he really has to know. Undo what you've done, end of story.
>
> 2. Patch 2 just introduces dead code that should never happen unless
> something is horribly broken in the core (memory offlined although
> nacked from notifier). And, it (for *whatever reason*) thinks it's okay
> to bail out if another noYtifier canceled offlining hotplugged memory.
>
> I fail to see the benefit for core changes and

Agreed! If arm64 wants to check and report early bootmem memory
offlining then just do it. There is no reason to add a whole machinery
for that. Cancel notifier is indeed only supposed to restore the state
before GOING_OFFLINE.

> 4 files changed, 99 insertions(+), 13 deletions(-)

--
Michal Hocko
SUSE Labs