Re: [PATCH RFC v2 1/2] mm: Don't skip swap entry even if zap_details specified
From: Peter Xu
Date: Fri Jan 21 2022 - 00:19:47 EST
On Fri, Jan 21, 2022 at 11:11:42AM +0800, Peter Xu wrote:
> On Thu, Jan 20, 2022 at 06:32:29PM +0800, Peter Xu wrote:
> > > Except that here we have no page to check, so it looks like you'll
> > > have to change should_zap_page() to deal with this case too, or just
> > > check details->check_mapping directly.
> >
> > Yeah I prefer this, as we don't have the page* pointer anyway.
> >
> > > Which raises the question again
> > > of why I did not just use a boolean flag there originally: aah, I think
> > > I've found why. In those days there was a horrible "optimization", for
> > > better performance on some benchmark I guess, which when you read from
> > > /dev/zero into a private mapping, would map the zero page there (look
> > > up read_zero_pagealigned() and zeromap_page_range() if you dare). So
> > > there was another category of page to be skipped along with the anon
> > > COWs, and I didn't want multiple tests in the zap loop, so checking
> > > check_mapping against page->mapping did both. I think nowadays you
> > > could do it by checking for PageAnon page (or genuine swap entry)
> > > instead.
> >
> > It must be PageAnon already, isn't it?
>
> I think I see what you meant now..
>
> I assume the special case is gone, how about I switch zap_mappings back into
> a boolean altogether in this patchset? Thanks,
Oh, one more thing..
When reading the history and also your explanations above, I figured one thing
that may not be right for a long time, on zero page handling of zapping.
If to quote your comment above, we should keep the zero page entries too if
zap_details.zap_mapping is specified. However it's not true for a long time, I
guess starting from when vm_normal_page() returns NULL for zero pfns. I also
have a very strong feeling that in the old world the "page*" is non-NULL for
zero pages here.
So... I'm boldly thinking whether we should also need another fix upon the zero
page handling here too, probably before this whole patchset (so it'll be the
1st patch, it should directly apply to master) because if I'm right on above it
can be seen as another separate bug fix:
---8<---
diff --git a/mm/memory.c b/mm/memory.c
index f306e698a1e3..9b8348746e0b 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1320,11 +1320,18 @@ struct zap_details {
static inline bool
zap_skip_check_mapping(struct zap_details *details, struct page *page)
{
- if (!details || !page)
+ /* No detail pointer or no zap_mapping pointer, zap all */
+ if (!details || !details->zap_mapping)
return false;
- return details->zap_mapping &&
- (details->zap_mapping != page_rmapping(page));
+ /*
+ * For example, the zero page. If the user wants to keep the private
+ * pages, zero page should also be in count.
+ */
+ if (!page)
+ return true;
+
+ return details->zap_mapping != page_rmapping(page);
}
static unsigned long zap_pte_range(struct mmu_gather *tlb,
---8<---
page can be NULL for e.g. PFNMAP and when error occured too above. I assume we
don't need to bother with them (e.g., I don't think PFNMAP or MIXED should
specify even_cows=false at all, because there's no page cache layer), though.
Mostly it's about how we should handle zero page right.
Thanks,
--
Peter Xu