Re: [PATCH v7 03/19] gup: Turn fault_in_pages_{readable,writeable} into fault_in_{readable,writeable}
From: Andreas Gruenbacher
Date: Tue Sep 28 2021 - 16:41:58 EST
Hi Willy,
On Tue, Sep 28, 2021 at 6:40 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> On Tue, Sep 28, 2021 at 05:02:43PM +0200, Andreas Gruenbacher wrote:
> > On Fri, Sep 3, 2021 at 4:57 PM Filipe Manana <fdmanana@xxxxxxxxx> wrote:
> > > On Fri, Aug 27, 2021 at 5:52 PM Andreas Gruenbacher <agruenba@xxxxxxxxxx> wrote:
> > > > +size_t fault_in_writeable(char __user *uaddr, size_t size)
> > > > +{
> > > > + char __user *start = uaddr, *end;
> > > > +
> > > > + if (unlikely(size == 0))
> > > > + return 0;
> > > > + if (!PAGE_ALIGNED(uaddr)) {
> > > > + if (unlikely(__put_user(0, uaddr) != 0))
> > > > + return size;
> > > > + uaddr = (char __user *)PAGE_ALIGN((unsigned long)uaddr);
> > > > + }
> > > > + end = (char __user *)PAGE_ALIGN((unsigned long)start + size);
> > > > + if (unlikely(end < start))
> > > > + end = NULL;
> > > > + while (uaddr != end) {
> > > > + if (unlikely(__put_user(0, uaddr) != 0))
> > > > + goto out;
> > > > + uaddr += PAGE_SIZE;
> > >
> > > Won't we loop endlessly or corrupt some unwanted page when 'end' was
> > > set to NULL?
> >
> > What do you mean? We set 'end' to NULL when start + size < start
> > exactly so that the loop will stop when uaddr wraps around.
>
> But think about x86-64. The virtual address space (unless you have 5
> level PTs) looks like:
>
> [0, 2^47) userspace
> [2^47, 2^64 - 2^47) hole
> [2^64 - 2^47, 2^64) kernel space
>
> If we try to copy from the hole we'll get some kind of fault (I forget
> the details). We have to stop at the top of userspace.
If you look at the state before this patch, fault_in_pages_readable
and fault_in_pages_writeable failed any attempt to fault in a range
that wraps, returning -EFAULT. That's sensible for a function that
returns an all-or-nothing result. We now want to return how much of
the range was (or wasn't) faulted in. We could do that and still
reject ranges that wrap outright. Or we could try to fault in however
much we reasonably can even if the range wraps. The patch does the
latter, which is where the stopping at NULL comes from: when the range
wraps, we *definitely* don't want to go any further.

If the range extends into the hole, we'll get a failure from
__get_user or __put_user at the point where that happens. That's
entirely the expected result, isn't it?
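
To make the two cases concrete, here is a rough user-space model of the
loop, not the kernel code itself. The names model_fault_in, probe_fn,
probe_below_hole, probe_always_ok and the MODEL_* macros are made up
for illustration, the probe callback stands in for __put_user, and the
tail that computes the return value is an assumption, since the quoted
hunk stops before the "out:" label.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants: 4 KiB pages, 47-bit user address space. */
#define MODEL_PAGE_SIZE 4096ULL
#define MODEL_PAGE_MASK (~(MODEL_PAGE_SIZE - 1))
#define MODEL_PAGE_ALIGN(x) (((x) + MODEL_PAGE_SIZE - 1) & MODEL_PAGE_MASK)
#define MODEL_USER_TOP (1ULL << 47)

/* Stand-in for __put_user(): returns 0 on success, nonzero on fault. */
typedef int (*probe_fn)(uint64_t addr);

/* Succeeds only below the top of userspace, fails in the hole. */
static int probe_below_hole(uint64_t addr)
{
	return addr < MODEL_USER_TOP ? 0 : -1;
}

/* Never faults; used to isolate the wrap-around handling. */
static int probe_always_ok(uint64_t addr)
{
	(void)addr;
	return 0;
}

/* Returns the number of bytes that were NOT faulted in. */
static uint64_t model_fault_in(uint64_t uaddr, uint64_t size, probe_fn probe)
{
	uint64_t start = uaddr, end;

	if (size == 0)
		return 0;
	if (uaddr & ~MODEL_PAGE_MASK) {
		if (probe(uaddr) != 0)
			return size;
		uaddr = MODEL_PAGE_ALIGN(uaddr);
	}
	end = MODEL_PAGE_ALIGN(start + size);
	if (end < start)
		end = 0;	/* range wraps: stop where uaddr wraps to 0 */
	while (uaddr != end) {
		if (probe(uaddr) != 0)
			break;	/* first failing page ends the walk */
		uaddr += MODEL_PAGE_SIZE;
	}
	/* Assumed tail: report what is left past the last good page. */
	if (size > uaddr - start)
		return size - (uaddr - start);
	return 0;
}

/* Example checks for the two cases discussed in this thread. */
static void model_examples(void)
{
	/* Range runs into the hole: two pages succeed, two are left. */
	assert(model_fault_in(MODEL_USER_TOP - 2 * MODEL_PAGE_SIZE,
			      4 * MODEL_PAGE_SIZE, probe_below_hole)
	       == 2 * MODEL_PAGE_SIZE);
	/* Range wraps past 2^64: the walk stops at address 0. */
	assert(model_fault_in(0 - MODEL_PAGE_SIZE, 2 * MODEL_PAGE_SIZE,
			      probe_always_ok) == MODEL_PAGE_SIZE);
}
```

In this model the wrap check only bounds the walk. A range that
reaches the hole is cut short by the failing probe long before the
end == 0 case can matter.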
Thanks,
Andreas