Re: [PATCH 2/3] mm/sparsemem: get physical address to page struct instead of virtual address to pfn

From: David Hildenbrand
Date: Sun Feb 09 2020 - 09:14:38 EST




> On 09.02.2020 at 14:50, Baoquan He <bhe@xxxxxxxxxx> wrote:
>
> On 02/07/20 at 11:26am, Wei Yang wrote:
>>> On Thu, Feb 06, 2020 at 06:19:46PM -0800, Dan Williams wrote:
>>> On Thu, Feb 6, 2020 at 3:17 PM Wei Yang <richardw.yang@xxxxxxxxxxxxxxx> wrote:
>>>>
>>>> memmap should be the physical address to page struct instead of virtual
>>>> address to pfn.
>>>>
>>>> Since we call this only for SPARSEMEM_VMEMMAP, pfn_to_page() is valid at
>>>> this point.
>>>>
>>>> Fixes: ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug")
>>>> Signed-off-by: Wei Yang <richardw.yang@xxxxxxxxxxxxxxx>
>>>> CC: Dan Williams <dan.j.williams@xxxxxxxxx>
>>>> ---
>>>> mm/sparse.c | 2 +-
>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/mm/sparse.c b/mm/sparse.c
>>>> index b5da121bdd6e..56816f653588 100644
>>>> --- a/mm/sparse.c
>>>> +++ b/mm/sparse.c
>>>> @@ -888,7 +888,7 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn,
>>>> /* Align memmap to section boundary in the subsection case */
>>>> if (IS_ENABLED(CONFIG_SPARSEMEM_VMEMMAP) &&
>>>> section_nr_to_pfn(section_nr) != start_pfn)
>>>> - memmap = pfn_to_kaddr(section_nr_to_pfn(section_nr));
>>>> + memmap = pfn_to_page(section_nr_to_pfn(section_nr));
>>>
>>> Yes, this looks obviously correct. This might be tripping up
>>> makedumpfile. Do you see any practical effects of this bug? The kernel
>>> mostly avoids ->section_mem_map in the vmemmap case and in the
>>> !vmemmap case section_nr_to_pfn(section_nr) should always equal
>>> start_pfn.
>>
>> I took another look at the code. It looks like this has no practical effect,
>> because in the vmemmap case we don't need ->section_mem_map to retrieve
>> the real memmap.
>>
>> But leaving inconsistent data in section_mem_map is not good practice.
>
> Yeah, it has no practical effect. I tried creating a sub-section
> aligned namespace and then triggering a crash; makedumpfile isn't impacted,
> because pmem memory is only added, not onlined. We don't report it
> to kdump, so makedumpfile will ignore it.
>
> I think it's worth fixing it to encode a correct memmap address. We
> don't know if in the future this will break anything.

We can have system memory and devmem overlap within a section (which is still buggy and needs fixing in other regards, e.g., pfn_to_online_page() does not work correctly).

E.g., a section with 64 MB of (boot) system memory. You can then hot-add devmem that spans the remaining 64 MB of that section.

So some of the memory in that section will be kdumped, and this should be fixed if broken.

Cheers


>
> Thanks
> Baoquan