Re: [EXTERNAL] Re: [RFC PATCH V2 2/10] mm: expose is_mem_section_removable() symbol

From: David Hildenbrand
Date: Mon Jan 13 2020 - 10:01:54 EST


On 13.01.20 15:49, Tianyu Lan wrote:
>> From: David Hildenbrand <david@xxxxxxxxxx>
>> Sent: Friday, January 10, 2020 9:42 PM
>> To: Michal Hocko <mhocko@xxxxxxxxxx>; lantianyu1986@xxxxxxxxx
>> Cc: KY Srinivasan <kys@xxxxxxxxxxxxx>; Haiyang Zhang
>> <haiyangz@xxxxxxxxxxxxx>; Stephen Hemminger <sthemmin@xxxxxxxxxxxxx>;
>> sashal@xxxxxxxxxx; akpm@xxxxxxxxxxxxxxxxxxxx; Michael Kelley
>> <mikelley@xxxxxxxxxxxxx>; Tianyu Lan <Tianyu.Lan@xxxxxxxxxxxxx>; linux-
>> hyperv@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx;
>> vkuznets <vkuznets@xxxxxxxxxx>; eric.devolder@xxxxxxxxxx; vbabka@xxxxxxx;
>> osalvador@xxxxxxx; Pasha Tatashin <Pavel.Tatashin@xxxxxxxxxxxxx>;
>> rppt@xxxxxxxxxxxxx
>> Subject: [EXTERNAL] Re: [RFC PATCH V2 2/10] mm: expose
>> is_mem_section_removable() symbol
>>
>> On 07.01.20 14:36, Michal Hocko wrote:
>>> On Tue 07-01-20 21:09:42, lantianyu1986@xxxxxxxxx wrote:
>>>> From: Tianyu Lan <Tianyu.Lan@xxxxxxxxxxxxx>
>>>>
>>>> The Hyper-V balloon driver will use is_mem_section_removable() to check
>>>> whether a memory block is removable when it receives a memory
>>>> hot-remove message. Expose it.
>>>
>>> I do not think this is a good idea. The check is inherently racy. Why
>>> cannot the balloon driver simply hotremove the region when asked?
>>>
>>
>> It's not only racy, it also gives no guarantees. False positives and false negatives
>> are possible.
>>
>> If you want to avoid having to loop forever trying to offline when calling
>> offline_and_remove_memory(), you could try to
>> alloc_contig_range() the memory first and then play the PG_offline+notifier
>> game like virtio-mem.
>>
>> I don't remember clearly, but I think pinned pages can make offlining loop for a
>> long time. And I remember there were other scenarios as well (including out of
>> memory conditions and similar).
>>
>> I sent an RFC [1] for powerpc/memtrace that does the same (just error
>> handling is more complicated as it wants to offline and remove multiple
>> consecutive memory blocks) - if you want to try to go down that path.
>>
> Hi David & Michal:
> Thanks for your review. Some memory blocks are not suitable for hot-plug.
> If the memory block is not checked for removability, offline_pages() will report failures,
> e.g. "failed due to memory holes" and "failure to isolate range". Maybe the check could be
> added to offline_and_remove_memory()? That would avoid creating/exposing a new
> interface for doing such a check in a module.

So it's all about the logging output. Duplicating these checks feels
very wrong. And you will still get plenty of page dumps (read below), so
that won't help.

We use pr_debug() for these "failure ..." messages, so that should
not be an issue on production systems, right?

However, you will see dump_page()s quite often, which logs via pr_warn().

Of course, we could add a mechanism to temporarily disable logging
output for these call paths, but it might actually be helpful for
debugging. We might just want to convert everything that is not actually
a warning to pr_debug() - especially in dump_page().
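
To illustrate the difference, here is a rough userspace sketch (not kernel
code). It assumes a simple level threshold as a stand-in for how
debug-level messages stay out of the log by default; the real pr_debug()
is controlled by DEBUG / dynamic debug rather than a threshold like this.
log_msg(), try_offline(), console_level and emitted are hypothetical names
used only for this example.

```c
#include <stdarg.h>
#include <stdio.h>

/* Userspace stand-in for kernel log levels; this is not printk. */
enum log_level { LOG_WARN = 4, LOG_DEBUG = 7 };

static int console_level = LOG_WARN; /* levels above this are dropped */
static int emitted;                  /* messages that reached the log */

static void log_msg(enum log_level lvl, const char *fmt, ...)
{
	va_list ap;

	if (lvl > console_level)
		return; /* debug-level output is suppressed by default */

	va_start(ap, fmt);
	vprintf(fmt, ap);
	va_end(ap);
	emitted++;
}

/* Hypothetical offline attempt that always fails and reports at 'lvl'. */
static int try_offline(enum log_level lvl)
{
	log_msg(lvl, "memory offlining failed due to memory holes\n");
	return -1;
}
```

With the default threshold, try_offline(LOG_DEBUG) fails silently, while
try_offline(LOG_WARN) leaves a line in the log on every failed attempt.
That is the kind of noise a retry loop would produce if expected failures
were reported at warning level.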

--
Thanks,

David / dhildenb