Re: [PATCH v4 4/4] ioctl_userfaultfd.2: Add write-protect mode docs

From: Peter Xu
Date: Mon Mar 29 2021 - 17:52:41 EST


On Thu, Mar 25, 2021 at 10:32:20PM +0100, Alejandro Colomar (man-pages) wrote:
> Hi Peter,
>
> On 3/23/21 8:16 PM, Peter Xu wrote:
> > On Tue, Mar 23, 2021 at 07:11:04PM +0100, Alejandro Colomar (man-pages) wrote:
> > > > +.TP
> > > > +.B UFFDIO_COPY_MODE_WP
> > > > +Copy the page with read-only permission.
> > > > +This allows the user to trap the next write to the page,
> > > > +which will block and generate another write-protect userfault message.
> > >
> > > s/write-protect/write-protected/
> > > ?
> >
> > I think here "write-protect" is the wording I wanted to use, it is the name of
> > the type of the message in plain text.
>
> Okay.
>
> >
> > [...]
> >
> > > > +.B EAGAIN
> > > > +The process was interrupted and need to retry.
> > >
> > > Maybe: "The process was interrupted; retry this call."?
> > > I don't know what other pages say about this kind of error.
> >
> > Frankly I see no difference between the two.. If you prefer the latter, I can
> > switch.
>
> I understand yours, but technically it's a bit incorrect: The subject of
> the sentence changes: in "The process was interrupted" it's the process, and
> in "need to retry" it's [you]. By separating the sentence into two, it's
> more natural. :)

Sure, I'll change.

>
> >
> > >
> > > > +.TP
> > > > +.B ENOENT
> > > > +The range specified in
> > > > +.I range
> > > > +is not valid.
> > >
> > > I'm not sure how this is different from the wording above in EINVAL. An
> > > "otherwise invalid range" was already giving EINVAL?
> >
> > This can be returned when vma is not found (mwriteprotect_range()):
> >
> >         err = -ENOENT;
> >         dst_vma = find_dst_vma(dst_mm, start, len);
> >
> >         if (!dst_vma)
> >                 goto out_unlock;
> >
> > I think maybe I could simply remove this entry, because from a user app
> > developer's pov I'd only be interested in specific errors that I'd be able to
> > detect and (even better) recover from. For such error I'd say there's not much
> > to do besides failing the app.
>
> If there's any possibility that the error can happen, it should be
> documented, even if it's to say "Fatal error; abort!". Just try to explain
> the causes and how to avoid causing them and/or possibly what to do when
> they happen (abort?).

Okay. Would you mind me keeping my original wording? IMHO it already does
what you describe as "trying to explain the causes" and so on:

.B ENOENT
The range specified in
.I range
is not valid.
For example, the virtual address does not exist,
or not registered with userfaultfd write-protect mode.

It does overlap slightly with EINVAL. But if you disagree with both the
wording and the overlap between the two errors, then what's needed is not a
rework of this patchset but a kernel patch that changes the error return value
so that they match. I'm not against proposing a kernel patch; I just don't see
it as strictly necessary.

In my own experience working with the kernel, return values are sometimes not
that strict. Say, it's hard to control every single possible return code of a
syscall/ioctl so that everything matches the documentation. We should always
try to be accurate, but it doesn't seem easy to me. It's also hard to write
documentation that matches the kernel code 100%, because at the very least
that requires a full walkthrough of every single piece of kernel code that the
syscall/ioctl calls, so as to collect all the errors and then summarize their
meanings. That could be a lot of work.
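
For illustration only, here is a minimal sketch of how a user app might act on
the errors discussed here: retry on EAGAIN and treat everything else (EINVAL,
ENOENT, EFAULT, ...) as fatal. The names wp_action, wp_classify() and
wp_ioctl_retry() are hypothetical and not part of any API. The wrapper is kept
generic so it builds without the userfaultfd headers; a real caller would
presumably pass UFFDIO_WRITEPROTECT as the request and a pointer to a
struct uffdio_writeprotect as the argument.

```c
#include <assert.h>
#include <errno.h>
#include <sys/ioctl.h>

/* What a caller can do about the result of a write-protect ioctl.
 * Hypothetical names, for illustration only. */
enum wp_action {
        WP_DONE,        /* the ioctl succeeded */
        WP_RETRY,       /* EAGAIN: interrupted; retry this call */
        WP_FATAL        /* EINVAL, ENOENT, EFAULT, ...: give up */
};

/* Map an errno value (0 meaning success) to an action. */
static enum wp_action wp_classify(int err)
{
        switch (err) {
        case 0:
                return WP_DONE;
        case EAGAIN:
                return WP_RETRY;
        default:
                return WP_FATAL;
        }
}

/*
 * Issue the ioctl and retry while it fails with EAGAIN, allowing at most
 * max_retries extra attempts. Returns 0 on success, otherwise the errno
 * of the last failed attempt.
 */
static int wp_ioctl_retry(int fd, unsigned long request, void *arg,
                          int max_retries)
{
        assert(max_retries >= 0);

        for (int attempt = 0; ; attempt++) {
                if (ioctl(fd, request, arg) == 0)
                        return 0;

                int err = errno;

                if (wp_classify(err) != WP_RETRY || attempt >= max_retries)
                        return err;
        }
}
```

With this shape, ENOENT and EINVAL land in the same WP_FATAL bucket, which is
why the overlap between the two matters little to the caller in practice.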

>
> >
> > >
> > > > +For example, the virtual address does not exist,
> > > > +or not registered with userfaultfd write-protect mode.
> > > > +.TP
> > > > +.B EFAULT
> > > > +Encountered a generic fault during processing.
> > >
> > > What is a "generic fault"?
> >
> > For example when the user copy failed due to some reason. See
> > userfaultfd_writeprotect():
> >
> >         if (copy_from_user(&uffdio_wp, user_uffdio_wp,
> >                            sizeof(struct uffdio_writeprotect)))
> >                 return -EFAULT;
> >
> > But I didn't check other places, generally I'd return -EFAULT if I can't find a
> > proper other replacement which has a clearer meaning.
> >
> > I don't think this is really helpful to user app too because no user app would
> > start to read this -EFAULT to do anything useful.. how about I drop it too if
> > you think the description is confusing?
>
> Same as above.

The copy_from_user() above is the only place I can find so far that could
trigger -EFAULT. So I could change the entry above into:

.TP
.B EFAULT
Failure on copying ioctl parameters into the kernel.

Would you think that's okay (before I repost)? I'd still prefer my original
wording, because I bet 90% of user developers may not even know what it means
when the kernel cannot copy the user parameters, or what they can do about
it. However, if you think it's proper, I'll use it.
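
To make "cannot copy the user parameter" a bit more concrete, here is a small
sketch of the usual way to hit EFAULT: handing the kernel a pointer it cannot
read. It is only a stand-in. It uses write(2) on a pipe instead of the
userfaultfd ioctl so that it runs without any userfaultfd setup, and
errno_for_unreadable_buffer() is a made-up helper name.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std= modes */
#endif
#include <assert.h>
#include <errno.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Ask the kernel to copy one byte out of a page mapped with PROT_NONE and
 * return the errno it reports (0 if the write unexpectedly succeeds).
 */
static int errno_for_unreadable_buffer(void)
{
        int fds[2];
        long page = sysconf(_SC_PAGESIZE);

        assert(page > 0);

        if (pipe(fds) != 0)
                return errno;

        /* A mapped but inaccessible page: the kernel cannot read from it. */
        void *buf = mmap(NULL, (size_t)page, PROT_NONE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (buf == MAP_FAILED) {
                int map_err = errno;

                close(fds[0]);
                close(fds[1]);
                return map_err;
        }

        /* The copy from user memory fails, so write() reports an error. */
        ssize_t n = write(fds[1], buf, 1);
        int err = (n < 0) ? errno : 0;

        munmap(buf, (size_t)page);
        close(fds[0]);
        close(fds[1]);
        return err;
}
```

From the app's side there's not much to recover here: the pointer passed in
was wrong, so the fix is in the caller's code rather than in a retry.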

Thanks,

--
Peter Xu