Re: [PATCH 3/3] btrfs: Avoid live-lock in search_ioctl() on hardware with sub-page faults
From: Andreas Gruenbacher
Date: Thu Nov 25 2021 - 16:43:11 EST
On Thu, Nov 25, 2021 at 10:02 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> On Thu, Nov 25, 2021 at 08:43:57PM +0000, Catalin Marinas wrote:
> > > I really believe that the fix is to make the read/write probing just
> > > be more aggressive.
> > >
> > > Make the read/write probing require that AT LEAST <n> bytes be
> > > readable/writable at the beginning, where 'n' is 'min(len,ALIGN)', and
> > > ALIGN is whatever size that copy_from/to_user_xyz() might require just
> > > because it might do multi-byte accesses.
> > >
> > > In fact, make ALIGN be perhaps something reasonable like 512 bytes or
> > > whatever, and then you know you can handle the btrfs "copy a whole
> > > structure and reset if that fails" case too.
> >
> > IIUC what you are suggesting, we still need changes to the btrfs loop
> > similar to willy's but that should work fine together with a slightly
> > more aggressive fault_in_writable().
> >
> > A probing of at least sizeof(struct btrfs_ioctl_search_key) should
> > suffice without any loop changes and 512 would cover it but it doesn't
> > look generic enough. We could pass a 'probe_prefix' argument to
> > fault_in_exact_writeable() to only probe this and btrfs would just
> > specify the above sizeof().
>
> How about something like this?
>
> +++ b/mm/gup.c
> @@ -1672,6 +1672,13 @@ size_t fault_in_writeable(char __user *uaddr, size_t size)
>
> if (unlikely(size == 0))
> return 0;
> + if (SUBPAGE_PROBE_INTERVAL) {
> + while (uaddr < PAGE_ALIGN((unsigned long)uaddr)) {
> + if (unlikely(__put_user(0, uaddr) != 0))
> + goto out;
> + uaddr += SUBPAGE_PROBE_INTERVAL;
> + }
> + }
> if (!PAGE_ALIGNED(uaddr)) {
> if (unlikely(__put_user(0, uaddr) != 0))
> return size;
>
> ARM then defines SUBPAGE_PROBE_INTERVAL to be 16 and the rest of us
> leave it as 0. That way we probe all the way to the end of the current
> page and the start of the next page.
>
> Oh, that needs to be checked to not exceed size as well ... anyway,
> you get the idea.
Note that we don't need this additional probing when accessing the
buffer with byte granularity. That's probably the common case.
Andreas