Re: [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()
From: David Hildenbrand
Date: Thu Apr 09 2020 - 03:26:37 EST
On 09.04.20 04:59, piliu wrote:
>
>
> On 04/08/2020 10:46 AM, Baoquan He wrote:
>> Add Pingfan to CC since he usually handles ppc related bugs for RHEL.
>>
>> On 04/07/20 at 03:54pm, David Hildenbrand wrote:
>>> In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory
>>> blocks as removable"), the user space interface to compute whether a memory
>>> block can be offlined (exposed via
>>> /sys/devices/system/memory/memoryX/removable) has effectively been
>>> deprecated. We want to remove the leftovers of the kernel implementation.
>>
>> Pingfan, can you have a look at this change on PPC? Please feel free to
>> give comments if any concern, or offer ack if it's OK to you.
>>
>>>
>>> When offlining a memory block (mm/memory_hotplug.c:__offline_pages()),
>>> we'll start by:
>>> 1. Testing if it contains any holes, and reject if so
>>> 2. Testing if pages belong to different zones, and reject if so
>>> 3. Isolating the page range, checking if it contains any unmovable pages
>>>
>>> Using is_mem_section_removable() before trying to offline is not only racy,
>>> it can easily result in false positives/negatives. Let's stop manually
>>> checking is_mem_section_removable(), and let device_offline() handle it
>>> completely instead. We can remove the racy is_mem_section_removable()
>>> implementation next.
>>>
>>> We now take more locks (e.g., memory hotplug lock when offlining and the
>>> zone lock when isolating), but maybe we should optimize that
>>> implementation instead if this ever becomes a real problem (after all,
>>> memory unplug is already an expensive operation). We started using
>>> is_mem_section_removable() in commit 51925fb3c5c9 ("powerpc/pseries:
>>> Implement memory hotplug remove in the kernel"), with the initial
>>> hotremove support of lmbs.
>>>
>>> Cc: Nathan Fontenot <nfont@xxxxxxxxxxxxxxxxxx>
>>> Cc: Michael Ellerman <mpe@xxxxxxxxxxxxxx>
>>> Cc: Benjamin Herrenschmidt <benh@xxxxxxxxxxxxxxxxxxx>
>>> Cc: Paul Mackerras <paulus@xxxxxxxxx>
>>> Cc: Michal Hocko <mhocko@xxxxxxxx>
>>> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
>>> Cc: Oscar Salvador <osalvador@xxxxxxx>
>>> Cc: Baoquan He <bhe@xxxxxxxxxx>
>>> Cc: Wei Yang <richard.weiyang@xxxxxxxxx>
>>> Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>
>>> ---
>>> .../platforms/pseries/hotplug-memory.c | 26 +++----------------
>>> 1 file changed, 3 insertions(+), 23 deletions(-)
>>>
>>> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
>>> index b2cde1732301..5ace2f9a277e 100644
>>> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
>>> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
>>> @@ -337,39 +337,19 @@ static int pseries_remove_mem_node(struct device_node *np)
>>>
>>>  static bool lmb_is_removable(struct drmem_lmb *lmb)
>>>  {
>>> -	int i, scns_per_block;
>>> -	bool rc = true;
>>> -	unsigned long pfn, block_sz;
>>> -	u64 phys_addr;
>>> -
>>>  	if (!(lmb->flags & DRCONF_MEM_ASSIGNED))
>>>  		return false;
>>>
>>> -	block_sz = memory_block_size_bytes();
>>> -	scns_per_block = block_sz / MIN_MEMORY_BLOCK_SIZE;
>>> -	phys_addr = lmb->base_addr;
>>> -
>>>  #ifdef CONFIG_FA_DUMP
>>>  	/*
>>>  	 * Don't hot-remove memory that falls in fadump boot memory area
>>>  	 * and memory that is reserved for capturing old kernel memory.
>>>  	 */
>>> -	if (is_fadump_memory_area(phys_addr, block_sz))
>>> +	if (is_fadump_memory_area(lmb->base_addr, memory_block_size_bytes()))
>>>  		return false;
>>>  #endif
>>> -
>>> -	for (i = 0; i < scns_per_block; i++) {
>>> -		pfn = PFN_DOWN(phys_addr);
>>> -		if (!pfn_in_present_section(pfn)) {
>>> -			phys_addr += MIN_MEMORY_BLOCK_SIZE;
>>> -			continue;
>>> -		}
>>> -
>>> -		rc = rc && is_mem_section_removable(pfn, PAGES_PER_SECTION);
>>> -		phys_addr += MIN_MEMORY_BLOCK_SIZE;
>>> -	}
>>> -
>>> -	return rc;
>>> +	/* device_offline() will determine if we can actually remove this lmb */
>>> +	return true;
> So I think here swaps the check and do sequence. At least it breaks
> dlpar_memory_remove_by_count(). It is doable to remove
> is_mem_section_removable(), but here should be more effort to re-arrange
> the code.
>
Thanks Pingfan,

1. "swaps the check and do sequence":

Partially. Any caller of dlpar_remove_lmb() already has to deal with
false positives: device_offline() can easily fail even after
lmb_is_removable() == true. The check is inherently racy.

2. "breaks dlpar_memory_remove_by_count()":

Can you elaborate why it "breaks" it? It will simply try to
offline+remove lmbs, detect that it wasn't able to offline+remove as
many as it wanted (which could easily happen before as well), and re-add
the already offlined+removed ones.

3. "more effort to re-arrange the code":

What would be your suggestion?

We would rip out the racy up-front check of whether we can remove as
much memory as requested in dlpar_memory_remove_by_count(), and simply
always try to remove + recover.
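
To make the "try to remove + recover" idea concrete, here is a rough
userspace sketch. It is not the pseries code: struct fake_lmb and all
fake_*() helpers are hypothetical stand-ins, and the "can_offline" flag
merely simulates whether an offline attempt would succeed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an LMB; not the kernel's struct drmem_lmb. */
struct fake_lmb {
	bool assigned;		/* currently part of the system */
	bool can_offline;	/* simulated outcome of an offline attempt */
	bool reserved;		/* removed by the current request */
};

/* Simulated offline+remove: 0 on success, -1 on failure. */
static int fake_remove_lmb(struct fake_lmb *lmb)
{
	if (!lmb->assigned || !lmb->can_offline)
		return -1;
	lmb->assigned = false;
	return 0;
}

/* Simulated re-add of a previously removed LMB. */
static void fake_add_lmb(struct fake_lmb *lmb)
{
	lmb->assigned = true;
}

/*
 * Try to remove 'want' LMBs without any up-front removability check.
 * If fewer than 'want' could be removed, re-add the ones already
 * removed and report failure.
 */
static int fake_remove_by_count(struct fake_lmb *lmbs, size_t n, size_t want)
{
	size_t i, removed = 0;

	assert(lmbs != NULL);

	for (i = 0; i < n; i++)
		lmbs[i].reserved = false;

	/* Just try; a failed attempt is the only reliable answer. */
	for (i = 0; i < n && removed < want; i++) {
		if (fake_remove_lmb(&lmbs[i]))
			continue;
		lmbs[i].reserved = true;
		removed++;
	}

	if (removed == want)
		return 0;

	/* Not enough: recover by re-adding what we already removed. */
	for (i = 0; i < n; i++) {
		if (lmbs[i].reserved) {
			fake_add_lmb(&lmbs[i]);
			lmbs[i].reserved = false;
		}
	}
	return -1;
}
```

The point of the sketch is only that the rollback path is needed with or
without a prior check, so dropping the check does not change what
callers must handle.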
--
Thanks,
David / dhildenb