Re: [RFC] mm/gup.c: Updated return value of {get|pin}_user_pages_fast()

From: Souptick Joarder
Date: Thu May 07 2020 - 06:32:31 EST


On Thu, May 7, 2020 at 3:43 PM Jan Kara <jack@xxxxxxx> wrote:
>
> On Wed 06-05-20 21:38:40, Souptick Joarder wrote:
> > On Wed, May 6, 2020 at 6:29 PM Jan Kara <jack@xxxxxxx> wrote:
> > >
> > > On Wed 06-05-20 17:51:39, Souptick Joarder wrote:
> > > > On Wed, May 6, 2020 at 3:36 PM Jan Kara <jack@xxxxxxx> wrote:
> > > > >
> > > > > On Wed 06-05-20 02:06:56, Souptick Joarder wrote:
> > > > > > On Wed, May 6, 2020 at 1:08 AM John Hubbard <jhubbard@xxxxxxxxxx> wrote:
> > > > > > >
> > > > > > > On 2020-05-05 12:14, Souptick Joarder wrote:
> > > > > > > > > Currently {get|pin}_user_pages_fast() have 3 possible return values: 0,
> > > > > > > > > -errno, and the number of pinned pages. The only case where these two
> > > > > > > > > functions return 0 is nr_pages <= 0, which has no valid use case. If
> > > > > > > > > that case does occur, a -errno will be returned instead of 0, so
> > > > > > > > > {get|pin}_user_pages_fast() will have 2 return values: -errno and the
> > > > > > > > > number of pinned pages.
> > > > > > > > >
> > > > > > > > > Update all the callers which deal with a return value of 0 accordingly.
> > > > > > >
> > > > > > > Hmmm, seems a little shaky. In order to do this safely, I'd recommend
> > > > > > > > first changing gup_fast/pup_fast so that they return -EINVAL if
> > > > > > > the caller specified nr_pages==0, and of course auditing all callers,
> > > > > > > to ensure that this won't cause problems.
> > > > > >
> > > > > > > While auditing, it turned out that there are 5 callers which care about
> > > > > > > a return value of 0 from gup_fast/pup_fast. What problem might it cause
> > > > > > > if we change gup_fast/pup_fast to return -EINVAL and update all the
> > > > > > > callers in a single commit?
> > > > >
> > > > > Well, first I'd ask a different question: Why do you want to change the
> > > > > current behavior? It's not like the current behavior is confusing. Callers
> > > > > that pass >0 pages can happily rely on the simple behavior of < 0 return on
> > > > > error or > 0 return if we mapped some pages. Callers that can possibly ask
> > > > > to map 0 pages can get 0 pages back - kind of expected - and I don't see
> > > > > any benefit in trying to rewrite these callers to handle -EINVAL instead...
> > > >
> > > > > A caller requesting to map 0 pages doesn't have a valid use case. But if any
> > > > > caller ends up doing so by mistake, -errno should be returned rather than 0,
> > > > > which indicates more precisely that mapping 0 pages is not a valid request
> > > > > from the caller.
> > >
> > > > Well, I believe this depends on the point of view. Just as reading 0
> > > > bytes is successful, we could consider mapping 0 pages successful as well.
> > > > And there can be valid cases where the number of pages to map is computed
> > > > from some input and when 0 pages should be mapped, it is not a problem. Your
> > > > change would force such callers to special case this by explicitly
> > > > checking for 0 pages to map and not calling GUP in that case at all.
> > >
> > > I'm not saying what you propose is necessarily bad, I just say I don't find
> > > it any better than the current behavior and so IMO it's not worth the
> > > churn. Now if you can come up with some examples of current in-kernel users
> > > who indeed do get the handling of the return value wrong, I could be
> > > convinced otherwise.
> >
> > There are 5 callers of {get|pin}_user_pages_fast().
>
> Oh, there are *many* more callers than 5. It's more like 70. Just grep the
> source... And then you have all other {get|pin}_user_pages() variants that
> need to be kept consistent. So overall we have over 200 calls to some
> variant of GUP.

Sorry, I meant that out of 42 callers of {get|pin}_user_pages_fast() in
total, 5 care about a return value of 0.

>
> > arch/ia64/kernel/err_inject.c#L145
> > staging/gasket/gasket_page_table.c#L489
> >
> > Checking for a return value of 0 doesn't make sense for the above 2.
> >
> > drivers/platform/goldfish/goldfish_pipe.c#L277
> > net/rds/rdma.c#L165
> > drivers/tee/tee_shm.c#L262
> >
> > These 3 callers calculate the number of pages before passing it to
> > {get|pin}_user_pages_fast(). But if they end up passing nr_pages <= 0, a return
> > value of either 0 or -EINVAL isn't going to harm any existing
> > behavior of the callers.
> >
> > IMO, it is safe to return -errno for nr_pages <= 0, for
> > {get|pin}_user_pages_fast().
>
> OK, so no real problem with any of these callers. I still don't see a
> justification for the churn you suggest... Auditing all those code sites
> is going to be pretty tedious.

I tried to audit all 42 callers of {get|pin}_user_pages_fast() and
found these 5 callers which need to be updated. I think the other callers
of {get|pin}_user_pages_fast() will not be affected.

But I didn't go through other variants of gup/pup except
{get|pin}_user_pages_fast().
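
To make the two conventions under discussion concrete, here is a minimal
user-space sketch. It is not kernel code: mock_gup_fast_current(),
mock_gup_fast_proposed() and caller_pin_all() are hypothetical stand-ins
invented for illustration, and the mocks simply pretend that every requested
page was pinned. It only shows how a caller that computes nr_pages from its
input would see nr_pages == 0 under each convention.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical stand-in for the current convention described in this
 * thread: nr_pages <= 0 returns 0, otherwise the number of pinned pages.
 * Assumption: every requested page is pinned successfully.
 */
int mock_gup_fast_current(int nr_pages)
{
	if (nr_pages <= 0)
		return 0;
	return nr_pages;
}

/*
 * Hypothetical stand-in for the proposed convention:
 * nr_pages <= 0 returns -EINVAL instead of 0.
 */
int mock_gup_fast_proposed(int nr_pages)
{
	if (nr_pages <= 0)
		return -EINVAL;
	return nr_pages;
}

/*
 * Hypothetical caller that computes nr_pages elsewhere and treats a
 * short pin as a failure. Returns 0 on success, -errno otherwise.
 */
int caller_pin_all(int (*gup)(int), int nr_pages)
{
	int ret = gup(nr_pages);

	if (ret < 0)
		return ret;		/* error reported by GUP */
	if (ret < nr_pages)
		return -EFAULT;		/* fewer pages pinned than requested */
	return 0;			/* everything requested was pinned */
}
```

With the current-style mock, caller_pin_all() treats a request for 0 pages
as success; with the proposed-style mock the same request comes back as
-EINVAL, so a caller that can legitimately compute 0 pages would have to
check for that case before calling.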