Re: [PATCH RFC v2 1/2] mm: Don't skip swap entry even if zap_details specified

From: Hugh Dickins
Date: Mon Jan 24 2022 - 01:52:13 EST


On Fri, 21 Jan 2022, Peter Xu wrote:
>
> Oh, one more thing..
>
> When reading the history and also your explanations above, I figured one thing
> that may not be right for a long time, on zero page handling of zapping.
>
> If to quote your comment above, we should keep the zero page entries too if
> zap_details.zap_mapping is specified. However it's not true for a long time, I
> guess starting from when vm_normal_page() returns NULL for zero pfns. I also
> have a very strong feeling that in the old world the "page*" is non-NULL for
> zero pages here.
>
> So... I'm boldly thinking whether we should also need another fix upon the zero
> page handling here too, probably before this whole patchset (so it'll be the
> 1st patch, it should directly apply to master) because if I'm right on above it
> can be seen as another separate bug fix:
>
> ---8<---
> diff --git a/mm/memory.c b/mm/memory.c
> index f306e698a1e3..9b8348746e0b 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -1320,11 +1320,18 @@ struct zap_details {
> static inline bool
> zap_skip_check_mapping(struct zap_details *details, struct page *page)
> {
> - if (!details || !page)
> + /* No detail pointer or no zap_mapping pointer, zap all */
> + if (!details || !details->zap_mapping)
> return false;
>
> - return details->zap_mapping &&
> - (details->zap_mapping != page_rmapping(page));
> + /*
> + * For example, the zero page. If the user wants to keep the private
> + * pages, zero page should also be in count.
> + */
> + if (!page)
> + return true;
> +
> + return details->zap_mapping != page_rmapping(page);
> }
>
> static unsigned long zap_pte_range(struct mmu_gather *tlb,
> ---8<---
>
> page can be NULL for e.g. PFNMAP and when error occured too above. I assume we
> don't need to bother with them (e.g., I don't think PFNMAP or MIXED should
> specify even_cows=false at all, because there's no page cache layer), though.
> Mostly it's about how we should handle zero page right.

I have not understood the above.

I don't know of any problem that needs fixing with the zero page:
how do you suppose the zero page gets into a truncatable or hole-punchable
mapping? We use it for read faults in anonymous mappings. And I told the
story of how once-upon-a-time it could get inserted into any mapping by
reading from /dev/zero, but that odd case was dropped years ago. And I
am open to (even encouraging) a change to make use of zero page for read
faults of holes in shmem: but that's potential future work, which would
require some changes elsewhere (though perhaps none here: the zero page
could never be used for the result of a COW).

Please explain the zero page problem you hope to fix here.

Hugh