Re: [PATCH v1] mm: Fix access of uninitialized memmaps in fs/proc/page.c
From: David Hildenbrand
Date: Wed Oct 09 2019 - 08:58:32 EST
On 09.10.19 13:24, Michal Hocko wrote:
> On Wed 09-10-19 12:19:59, David Hildenbrand wrote:
> [...]
>>> pfn_to_online_page makes sense because offline pages are not really in a
>>> defined state. This would be worth a patch of its own. I remember there
>>
>> The issue is, once I check for pfn_to_online_page(), these functions
>> can't handle ZONE_DEVICE at all anymore. Especially in regards to
>> memory_failure() I don't think this is acceptable.
>
> Could you be more specific please? I am not sure I am following.
I wasn't quite clear, let me try to be more precise:
if (pfn_to_online_page(pfn)) {
	/* memmap initialized */
} else if (pfn_valid(pfn)) {
	/* ???
	 * a) offline memory. memmap garbage.
	 * b) memremap memory: memmap initialized to ZONE_DEVICE.
	 * c) memremap memory: reserved for driver. memmap garbage.
	 * (d) memremap memory: memmap currently initializing - garbage)
	 */
}
To distinguish between a) and b/c), we can currently only use
get_dev_pagemap(pfn, NULL). To distinguish between b) and c), we can
currently only use pfn_zone_device_reserved().
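
To make the decision order above concrete, here is a minimal userspace
sketch, not kernel code. The struct pfn_model, enum memmap_state and
classify_pfn() names are hypothetical; each bool merely stands in for what
the named kernel helper would report for a pfn. Case d) is left out
because nothing listed above can tell it apart from b).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: one flag per kernel helper discussed above. */
struct pfn_model {
	bool valid;        /* stands in for pfn_valid(pfn) */
	bool online;       /* stands in for pfn_to_online_page(pfn) != NULL */
	bool dev_pagemap;  /* stands in for get_dev_pagemap(pfn, NULL) != NULL */
	bool dev_reserved; /* stands in for pfn_zone_device_reserved() */
};

enum memmap_state {
	MEMMAP_INVALID,         /* no memmap at all */
	MEMMAP_ONLINE,          /* memmap initialized */
	MEMMAP_OFFLINE,         /* a) offline memory, memmap garbage */
	MEMMAP_ZONE_DEVICE,     /* b) memmap initialized to ZONE_DEVICE */
	MEMMAP_DEVICE_RESERVED, /* c) reserved for driver, memmap garbage */
};

static enum memmap_state classify_pfn(const struct pfn_model *p)
{
	/* An online pfn is assumed to always be a valid pfn. */
	assert(!p->online || p->valid);

	if (p->online)
		return MEMMAP_ONLINE;
	if (!p->valid)
		return MEMMAP_INVALID;
	if (!p->dev_pagemap)
		return MEMMAP_OFFLINE;          /* a) */
	if (p->dev_reserved)
		return MEMMAP_DEVICE_RESERVED;  /* c) */
	return MEMMAP_ZONE_DEVICE;              /* b) */
}
```

Only MEMMAP_ONLINE and MEMMAP_ZONE_DEVICE would be safe to touch in this
model; the other states mean the memmap must not be read.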
That implies that, right now, if we want to fix what is described in
the patch without introducing more users of get_dev_pagemap(pfn, NULL),
we will not be able to support ZONE_DEVICE in:
- /proc/kpagecount
- /proc/kpageflags
- /proc/kpagecgroup
if (pfn_to_online_page(pfn)) {
	/* memmap initialized */
} else {
	/* skip */
}
Now, memory_failure() already contains a get_dev_pagemap(pfn, NULL)
check and adding pfn_to_online_page(pfn) would also work.
I would be fine with this, but it means that - for now - the three
/proc/ files won't be able to deal with ZONE_DEVICE memory.
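
As a rough illustration of what "skip" would mean for a reader like
/proc/kpagecount, here is a userspace sketch under stated assumptions.
struct model_page, model_pfn_to_online_page() and model_kpagecount_read()
are hypothetical stand-ins, not the code in fs/proc/page.c; the only
behaviour modelled is that a pfn without an online page reports 0 and its
memmap entry is never read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a memmap entry. */
struct model_page {
	bool online;       /* false: offline or ZONE_DEVICE in this model */
	uint64_t mapcount; /* only meaningful when online */
};

/* Models pfn_to_online_page(): NULL unless the pfn is in range and online. */
static const struct model_page *
model_pfn_to_online_page(const struct model_page *memmap, size_t nr_pages,
			 size_t pfn)
{
	if (pfn >= nr_pages || !memmap[pfn].online)
		return NULL;
	return &memmap[pfn];
}

/* Fills out[0..count) with one value per pfn, 0 for every skipped pfn. */
static void model_kpagecount_read(const struct model_page *memmap,
				  size_t nr_pages, size_t start_pfn,
				  size_t count, uint64_t *out)
{
	assert(out != NULL);

	for (size_t i = 0; i < count; i++) {
		const struct model_page *page =
			model_pfn_to_online_page(memmap, nr_pages,
						 start_pfn + i);

		/* Skip: never dereference a memmap that may be garbage. */
		out[i] = page ? page->mapcount : 0;
	}
}
```

In this model a ZONE_DEVICE pfn is indistinguishable from an offline one,
which is exactly the loss of ZONE_DEVICE support described above.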
>
>> So while I
>> (personally) only care about adding pfn_to_online_page() checks, the
>> in-this-sense-fragile-subsection ZONE_DEVICE implementation requires me
>> to introduce a temporary check for initialized memmaps.
>>
>>> was a discussion about the uninitialized zone device memmaps. It would
>>> be really good to summarize this discussion in the changelog and
>>> conclude why the explicit check is really good and what were other
>>> alternatives considered.
>>
>> Yeah, I also expressed my feelings and the issues to be solved by
>> ZONE_DEVICE people in https://lkml.org/lkml/2019/9/6/114. However, the
>> discussion stalled and nobody really proposed a solution or followed up.
>
> I will try to get back to that discussion but is there any technical
> reason that prevents any conclusion or it is just stuck on a lack of
> time of the participants?
I think it was both: people did not respond to questions, and nobody
decided on a solution.
--
Thanks,
David / dhildenb