Re: [PATCH v1 1/1] mm/gup: skip pinnable check for refs==1

From: Minchan Kim
Date: Tue Feb 01 2022 - 02:28:34 EST


Hi John,

On Mon, Jan 31, 2022 at 12:49:35PM -0800, John Hubbard wrote:
> On 1/31/22 12:35, Will McVicker wrote:
> > This fixes commit 54d516b1d62f ("mm/gup: small refactoring: simplify
> > try_grab_page()") which refactors try_grab_page() to call
> > try_grab_compound_head() with refs=1. The refactor commit is causing
> > pin_user_pages() to return -ENOMEM when we try to pin one user page that
> > is migratable and not in the movable zone. Previously, try_grab_page()
> > didn't check if the page was pinnable for FOLL_PIN. To match the same
> > functionality, this fix adds the check `refs > 1 &&` to skip the call to
> > is_pinnable_page().
> >
>
> That's a clear write-up of what you're seeing, what caused it, and how
> you'd like to correct it. The previous code had a loophole, and you want
> to keep that loophole. More below...
>
> > This issue is reproducible with the Pixel 6 on the 5.15 LTS kernel. Here
> > is the call stack to reproduce the -ENOMEM error:
> ...
> > Fixes: 54d516b1d62f ("mm/gup: small refactoring: simplify try_grab_page()")
> > Cc: John Hubbard <jhubbard@xxxxxxxxxx>
> > Cc: Minchan Kim <minchan@xxxxxxxxxx>
> > Signed-off-by: Will McVicker <willmcvicker@xxxxxxxxxx>
> > ---
> > mm/gup.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/mm/gup.c b/mm/gup.c
> > index f0af462ac1e2..0509c49c46a3 100644
> > --- a/mm/gup.c
> > +++ b/mm/gup.c
> > @@ -135,7 +135,7 @@ struct page *try_grab_compound_head(struct page *page,
> > * right zone, so fail and let the caller fall back to the slow
> > * path.
> > */
> > - if (unlikely((flags & FOLL_LONGTERM) &&
> > + if (refs > 1 && unlikely((flags & FOLL_LONGTERM) &&
> > !is_pinnable_page(page)))
> > return NULL;
>
> ...but are you really sure that this is the best way to "fix" the
> problem? This trades correctness for "bug-for-bug compatibility" with
> the previous code. It says, "it's OK to violate the pinnable and
> longterm checks, as long as you do it one page at a time, rather than in
> larger chunks.
>
> Wouldn't it be better to try to fix up the calling code so that it's
> not in violation of these zone rules?

I think the problem is before pin_user_pages can work with CMA pages
in the fallback path but now it doesn't work with CMA page. Driver
couldn't know whether it will work with CMA page or not in advance.

pin_user_pages
__get_user_pages_locked
follow_page_mask
follow_page_pte
try_grab_page
!is_pinnable_page(page)
return NULL;
return ERR_PTR(-ENOMEM);
return -ENOMEM without faultin_page