Re: [PATCH 5/5] mm/thp: Split huge pmds/puds if they're pinned when fork()

From: Peter Xu
Date: Wed Sep 23 2020 - 11:44:26 EST


On Wed, Sep 23, 2020 at 04:01:14PM +0200, Jan Kara wrote:
> On Wed 23-09-20 09:50:04, Peter Xu wrote:
> > On Wed, Sep 23, 2020 at 11:22:05AM +0200, Jan Kara wrote:
> > > On Tue 22-09-20 13:01:13, John Hubbard wrote:
> > > > On 9/22/20 3:33 AM, Jan Kara wrote:
> > > > > On Mon 21-09-20 23:41:16, John Hubbard wrote:
> > > > > > On 9/21/20 2:20 PM, Peter Xu wrote:
> > > > > > ...
> > > > > > > + if (unlikely(READ_ONCE(src_mm->has_pinned) &&
> > > > > > > + page_maybe_dma_pinned(src_page))) {
> > > > > >
> > > > > > This condition would make a good static inline function. It's used in 3
> > > > > > places, and the condition is quite special and worth documenting, and
> > > > > > having a separate function helps with that, because the function name
> > > > > > adds to the story. I'd suggest approximately:
> > > > > >
> > > > > > page_likely_dma_pinned()
> > > > > >
> > > > > > for the name.
> > > > >
> > > > > Well, but we should also capture that this really only works for anonymous
> > > > > pages. For file pages mm->has_pinned does not work because the page may be
> > > > > still pinned by completely unrelated process as Jann already properly
> > > > > pointed out earlier in the thread. So maybe anon_page_likely_pinned()?
> > > > > Possibly also assert PageAnon(page) in it if we want to be paranoid...
> > > > >
> > > > > Honza
> > > >
> > > > The file-backed case doesn't really change anything, though:
> > > > page_maybe_dma_pinned() is already a "fuzzy yes" in the same sense: you
> > > > can get a false positive. Just like here, with an mm->has_pinned that
> > > > could be a false positive for a process.
> > > >
> > > > And for that reason, I'm also not sure an "assert PageAnon(page)" is
> > > > desirable. That assertion would prevent file-backed callers from being
> > > > able to call a function that provides a fuzzy answer, but I don't see
> > > > why you'd want or need to do that. The goal here is to make the fuzzy
> > > > answer a little bit more definite, but it's not "broken" just because
> > > > the result is still fuzzy, right?
> > > >
> > > > Apologies if I'm missing a huge point here... :)
> > >
> > > But the problem is that if you apply mm->has_pinned check on file pages,
> > > you can get false negatives now. And that's not acceptable...
> >
> > Do you mean the case where proc A pinned page P from a file, then proc B
> > mapped the same page P on the file, then fork() on proc B?
>
> Yes.
>
> > If proc B didn't explicitly pinned page P in B's address space too,
> > shouldn't we return "false" for page_likely_dma_pinned(P)? Because if
> > proc B didn't pin the page in its own address space, I'd think it's ok to
> > get the page replaced at any time as long as the content keeps the same.
> > Or couldn't we?
>
> So it depends on the reason why you call page_likely_dma_pinned(). For your
> COW purposes the check is correct but e.g. for "can filesystem safely
> writeback this page" the page_likely_dma_pinned() would be wrong. So I'm
> not objecting to the mechanism as such. I'm mainly objecting to the generic
> function name which suggests something else than what it really checks and
> thus it could be used in wrong places in the future... That's why I'd
> prefer to restrict the function to PageAnon pages where there's no risk of
> confusion what the check actually does.

How about I introduce the helper as John suggested, but rename it to

page_maybe_dma_pinned_by_mm()

?

Then we also don't need to judge on which is more likely to happen (between
"maybe" and "likely", since that will confuse me if I only read these words..).

I didn't use any extra suffix like "cow" because I think it might be useful for
things besides cow. Fundamentally the new helper will be mm-based, so "by_mm"
seems to suite better to me.

Does that sound ok?

--
Peter Xu