Re: [PATCH V11 1/5] mm/hotplug: Introduce arch callback validating the hot remove range
From: Anshuman Khandual
Date: Mon Jan 13 2020 - 04:10:16 EST
On 01/10/2020 02:12 PM, David Hildenbrand wrote:
> On 10.01.20 04:09, Anshuman Khandual wrote:
>> Currently there are two interfaces to initiate memory range hot removal i.e.
>> remove_memory() and __remove_memory() which then calls try_remove_memory().
>> Platform gets called with arch_remove_memory() to tear down required kernel
>> page tables and other arch specific procedures. But there are platforms
>> like arm64 which might want to prevent removal of certain specific memory
>> ranges irrespective of their present usage or movability properties.
>
> Why? Is this only relevant for boot memory? I hope so, otherwise the
> arch code needs fixing IMHO.
Right, it is relevant only for boot memory on the arm64 platform. But this
new arch callback makes it flexible enough to reject any given memory range.
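
To illustrate the flexibility argument, here is a rough userspace model of
the idea, not the code from the patch. All function names, the function
pointer indirection and the boot memory addresses below are hypothetical
and chosen only for illustration. The point it shows is that the range is
validated before any lock is taken or anything is torn down, and that the
default check accepts every range while a platform check may refuse some.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical boot memory range, values made up for this sketch. */
#define BOOT_MEM_START	0x80000000ULL
#define BOOT_MEM_END	0xC0000000ULL	/* exclusive */

/* Type of the per-platform validation hook (hypothetical). */
typedef bool (*range_check_fn)(uint64_t start, uint64_t size);

/* Default behaviour: every memory range may be removed. */
static bool default_range_removable(uint64_t start, uint64_t size)
{
	(void)start;
	(void)size;
	return true;
}

/* arm64-like behaviour: refuse any range overlapping boot memory. */
static bool bootmem_range_removable(uint64_t start, uint64_t size)
{
	uint64_t end = start + size;

	return end <= BOOT_MEM_START || start >= BOOT_MEM_END;
}

/*
 * Model of a hot remove range check. It runs before any teardown so
 * that a refusal leaves nothing to undo. Returns 0 or -EINVAL.
 */
static int check_hotremove_range(range_check_fn arch_check,
				 uint64_t start, uint64_t size)
{
	/* Reject empty ranges and ranges that wrap around. */
	if (size == 0 || start + size < start)
		return -EINVAL;

	return arch_check(start, size) ? 0 : -EINVAL;
}
```

In this model a caller would pass bootmem_range_removable on a platform
that wants to protect boot memory and default_range_removable elsewhere.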
>
> If it's only boot memory, we should disallow offlining instead via a
> memory notifier - much cleaner.
I don't have a detailed understanding of the MMU notifier mechanism, but from
some initial reading it seems we need an mm_struct for a notifier to monitor
various events on the page table. Just wondering how a physical memory range
like boot memory can be monitored, because it can be used both for the kernel
(init_mm) and for user space processes at the same time. Is there some
mechanism with which we could do this?
>
>>
>> Current arch call back arch_remove_memory() is too late in the process to
>> abort memory hot removal as memory block devices and firmware memory map
>> entries would have already been removed. Platforms should be able to abort
>> the process before taking the mem_hotplug_lock with mem_hotplug_begin().
>> This essentially requires a new arch callback for memory range validation.
>
> I somewhat dislike this very much. Memory removal should never fail if
> used sanely. See e.g., __remove_memory(), it will BUG() whenever
> something like that would strike.
>
>>
>> This differentiates memory range validation between memory hot add and hot
>> remove paths before carving out a new helper check_hotremove_memory_range()
>> which incorporates a new arch callback. This call back provides platforms
>> an opportunity to refuse memory removal at the very onset. In future the
>> same principle can be extended for memory hot add path if required.
>>
>> Platforms can choose to override this callback in order to reject specific
>> memory ranges from removal or can just fallback to a default implementation
>> which allows removal of all memory ranges.
>
> I suspect we really want to disallow offlining instead. E.g., I
If boot memory pages can reliably be prevented from being offlined, then that
would indirectly prevent the hot remove process as well.
> remember s390x does that with certain areas needed for dumping/kexec.
I could not find any references to mmu_notifier in arch/s390, or in any other
arch for that matter, apart from KVM (which has a user space component). Could
you please give some pointers?
>
> Somebody who added memory via add_memory() should always be able to
> remove the memory via remove_memory() again. Only boot memory can be
> treated in a special way, but boot memory is initially always online.
>
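
In case the suggestion refers to the memory hotplug notifier chain
(register_memory_notifier() with MEM_GOING_OFFLINE) rather than
mmu_notifier, here is a rough userspace model of how I read it. This is an
assumption on my side, not kernel code. The enum and struct names only
mirror the kernel ones, and the *_model helpers and the boot memory pfn
range are hypothetical. It shows a callback vetoing an offline request for
any range that overlaps boot memory.

```c
#include <stddef.h>

/* Local stand-ins that mirror kernel names, not the kernel definitions. */
enum mem_action { MEM_GOING_OFFLINE, MEM_OFFLINE };
enum notify_ret { NOTIFY_OK, NOTIFY_BAD };

struct memory_notify_model {
	unsigned long start_pfn;
	unsigned long nr_pages;
};

typedef enum notify_ret (*mem_notifier_fn)(enum mem_action action,
				const struct memory_notify_model *arg);

#define MAX_NOTIFIERS 4
static mem_notifier_fn notifier_chain[MAX_NOTIFIERS];
static size_t nr_notifiers;

/* Add a callback to the chain. Returns 0, or -1 if the chain is full. */
static int register_notifier_model(mem_notifier_fn fn)
{
	if (nr_notifiers == MAX_NOTIFIERS)
		return -1;
	notifier_chain[nr_notifiers++] = fn;
	return 0;
}

/* Hypothetical boot memory pfn range, end is exclusive. */
static const unsigned long boot_start_pfn = 0x80000;
static const unsigned long boot_end_pfn = 0xC0000;

/* Veto offlining of anything that overlaps boot memory. */
static enum notify_ret bootmem_notifier(enum mem_action action,
				const struct memory_notify_model *arg)
{
	unsigned long end_pfn = arg->start_pfn + arg->nr_pages;

	if (action != MEM_GOING_OFFLINE)
		return NOTIFY_OK;
	if (end_pfn <= boot_start_pfn || arg->start_pfn >= boot_end_pfn)
		return NOTIFY_OK;
	return NOTIFY_BAD;
}

/* Model of an offline request: a single NOTIFY_BAD aborts it. */
static int offline_pages_model(unsigned long start_pfn,
			       unsigned long nr_pages)
{
	struct memory_notify_model arg = { start_pfn, nr_pages };

	for (size_t i = 0; i < nr_notifiers; i++)
		if (notifier_chain[i](MEM_GOING_OFFLINE, &arg) == NOTIFY_BAD)
			return -1;
	return 0;
}
```

If offlining is refused this way, the memory stays online and a later hot
remove of that range would not get as far as arch_remove_memory().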