Re: get_user_pages returning 0 (was Re: kernel BUG at drivers/vhost/vhost.c:LINE!)
From: David Sterba
Date: Mon Mar 19 2018 - 11:32:11 EST
On Mon, Mar 19, 2018 at 05:09:28PM +0200, Michael S. Tsirkin wrote:
> Hello!
> The following code was triggered by syzbot:
>
> 	r = get_user_pages_fast(log, 1, 1, &page);
> 	if (r < 0)
> 		return r;
> 	BUG_ON(r != 1);
>
> Just looking at get_user_pages_fast's documentation this seems
> impossible - it is supposed to only ever return # of pages
> pinned or errno.
>
> However, poking at code, I see at least one path that might cause this:
>
> 	ret = faultin_page(tsk, vma, start, &foll_flags,
> 			   nonblocking);
> 	switch (ret) {
> 	case 0:
> 		goto retry;
> 	case -EFAULT:
> 	case -ENOMEM:
> 	case -EHWPOISON:
> 		return i ? i : ret;
> 	case -EBUSY:
> 		return i;
>
> which originally comes from:
>
> commit 53a7706d5ed8f1a53ba062b318773160cc476dde
> Author: Michel Lespinasse <walken@xxxxxxxxxx>
> Date: Thu Jan 13 15:46:14 2011 -0800
>
> mlock: do not hold mmap_sem for extended periods of time
>
> __get_user_pages gets a new 'nonblocking' parameter to signal that the
> caller is prepared to re-acquire mmap_sem and retry the operation if
> needed. This is used to split off long operations if they are going to
> block on a disk transfer, or when we detect contention on the mmap_sem.
>
> [akpm@xxxxxxxxxxxxxxxxxxxx: remove ref to rwsem_is_contended()]
> Signed-off-by: Michel Lespinasse <walken@xxxxxxxxxx>
> Cc: Hugh Dickins <hughd@xxxxxxxxxx>
> Cc: Rik van Riel <riel@xxxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Nick Piggin <npiggin@xxxxxxxxx>
> Cc: KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx>
> Cc: Ingo Molnar <mingo@xxxxxxx>
> Cc: "H. Peter Anvin" <hpa@xxxxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: David Howells <dhowells@xxxxxxxxxx>
> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
>
> I started looking into this, if anyone has any feedback meanwhile,
> that would be appreciated.
>
> In particular I don't really see why this would trigger
> on commit 8f5fd927c3a7576d57248a2d7a0861c3f2795973:
>
> Merge: 8757ae2 093e037
> Author: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> Date: Fri Mar 16 13:37:42 2018 -0700
>
> Merge tag 'for-4.16-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
>
> is btrfs used on these systems?
There were 3 patches pulled by that tag, and none of them is even remotely
related to the reported bug, AFAICS. If there's some impact, it must be
indirect: obvious bugs like a NULL pointer dereference would show up in a
different way and leave at least some trace in the stacks.
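
To make the quoted return paths concrete, here is a rough user-space sketch, not kernel code. It models only the switch quoted above: model_gup_return() and pin_one_checked() are hypothetical names, 'pinned' stands in for 'i', and the success path (ret == 0, "goto retry") is simplified to "one more page pinned", which is an assumption. The point is only that the -EBUSY exit can yield 0 when nothing has been pinned yet, and that a caller could map any short count to an error rather than hitting BUG_ON(). Whether that is the right fix for vhost is a separate question.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical model of the quoted switch in __get_user_pages().
 * 'pinned' plays the role of 'i' (pages pinned so far) and
 * 'fault_ret' the role of 'ret' returned by faultin_page().
 * -EHWPOISON is left out to keep the sketch portable.
 */
static long model_gup_return(long pinned, int fault_ret)
{
	assert(pinned >= 0);

	switch (fault_ret) {
	case 0:
		/* Assumption: the retry succeeds and pins one more page. */
		return pinned + 1;
	case -EFAULT:
	case -ENOMEM:
		/* The errno is reported only if nothing was pinned yet. */
		return pinned ? pinned : fault_ret;
	case -EBUSY:
		/* The count is returned as is: 0 on the very first page. */
		return pinned;
	default:
		return fault_ret;
	}
}

/*
 * Hypothetical caller-side check for a single-page request:
 * a negative value is passed through, any count other than 1
 * (including 0) becomes -EFAULT instead of tripping BUG_ON().
 */
static int pin_one_checked(long r)
{
	if (r < 0)
		return (int)r;
	if (r != 1)
		return -EFAULT;
	return 0;
}
```

With this model, pin_one_checked(model_gup_return(0, -EBUSY)) gives -EFAULT, whereas the original "r < 0" check followed by BUG_ON(r != 1) would have let the 0 through to the BUG_ON.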