Re: uninitialized pmem struct pages

From: David Hildenbrand
Date: Tue Jan 05 2021 - 04:18:08 EST


On 05.01.21 08:50, Michal Hocko wrote:
> On Mon 04-01-21 21:17:43, Dan Williams wrote:
>> On Mon, Jan 4, 2021 at 2:45 AM David Hildenbrand <david@xxxxxxxxxx> wrote:
> [...]
>>> I believe Dan mentioned somewhere that he wants to see a real instance
>>> of this producing a BUG before actually moving forward with a fix. I
>>> might be wrong.
>>
>> I think I'm missing an argument for the user-visible effects of the
>> "Bad." statements above. I think soft_offline_page() is a candidate
>> for a local fix because mm/memory-failure.c already has a significant
>> amount of page-type specific knowledge. So teaching it "yes" for
>> MEMORY_DEVICE_PRIVATE-ZONE_DEVICE and "no" for other ZONE_DEVICE seems
>> ok to me.
>
> I believe we do not want to teach _every_ pfn walker about zone device
> pages. This would be quite error prone. Especially when a missing check
> could lead to silently broken data or BUG_ON with debugging enabled
> (which is not the case for many production users). Or are we talking
> about different bugs here?

I'd like us to stick to the documentation, e.g., include/linux/mmzone.h:


"
pfn_valid() is meant to be able to tell if a given PFN has valid memmap
associated with it or not. This means that a struct page exists for this
pfn. The caller cannot assume the page is fully initialized in general.
Hotplugable pages might not have been onlined yet. pfn_to_online_page()
will ensure the struct page is fully online and initialized. Special
pages (e.g. ZONE_DEVICE) are never onlined and should be treated
accordingly.
"

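To make the distinction in that comment concrete, here is a minimal
userspace sketch, not kernel code. It only models the documented contract:
pfn_valid() says a memmap exists, while pfn_to_online_page() additionally
requires the page to be online and initialized. Every toy_* name, the state
enum, and the six-entry table are hypothetical and exist purely for
illustration; the real helpers consult section and zone metadata instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-pfn states for this toy model; not kernel definitions. */
enum toy_pfn_state {
	TOY_NO_MEMMAP,   /* no struct page exists for this pfn */
	TOY_OFFLINE,     /* memmap exists, hotplugged but not onlined yet */
	TOY_ONLINE,      /* memmap exists and is fully initialized */
	TOY_ZONE_DEVICE, /* memmap exists, but the page is never onlined */
};

struct toy_page {
	enum toy_pfn_state state;
};

#define TOY_NR_PFNS 6

/* Assumed example layout, chosen only to exercise every state. */
static struct toy_page toy_memmap[TOY_NR_PFNS] = {
	{ TOY_ONLINE }, { TOY_OFFLINE }, { TOY_ZONE_DEVICE },
	{ TOY_NO_MEMMAP }, { TOY_ONLINE }, { TOY_ZONE_DEVICE },
};

/* Models pfn_valid(): a memmap exists; nothing more is promised. */
static bool toy_pfn_valid(size_t pfn)
{
	return pfn < TOY_NR_PFNS && toy_memmap[pfn].state != TOY_NO_MEMMAP;
}

/* Models pfn_to_online_page(): NULL unless the page is online. */
static struct toy_page *toy_pfn_to_online_page(size_t pfn)
{
	if (!toy_pfn_valid(pfn) || toy_memmap[pfn].state != TOY_ONLINE)
		return NULL;
	return &toy_memmap[pfn];
}

/* A walker that trusts pfn_valid() alone also visits offline and
 * ZONE_DEVICE pages, whose contents it must not rely on. */
static size_t toy_count_valid(size_t start, size_t end)
{
	size_t pfn, nr = 0;

	assert(start <= end);
	for (pfn = start; pfn < end; pfn++)
		if (toy_pfn_valid(pfn))
			nr++;
	return nr;
}

/* A walker that uses the online-page lookup skips them by construction. */
static size_t toy_count_online(size_t start, size_t end)
{
	size_t pfn, nr = 0;

	assert(start <= end);
	for (pfn = start; pfn < end; pfn++)
		if (toy_pfn_to_online_page(pfn))
			nr++;
	return nr;
}
```

Over the assumed table, the pfn_valid()-style walker visits five pfns and
the pfn_to_online_page()-style walker visits two. The three extra pages are
exactly the offline and ZONE_DEVICE ones, which is where a generic walker
gets into trouble.
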
--
Thanks,

David / dhildenb