Re: [PATCH 10/23] userfaultfd: add new syscall to provide memory externalization

From: Linus Torvalds
Date: Thu May 14 2015 - 13:49:16 EST


On Thu, May 14, 2015 at 10:31 AM, Andrea Arcangeli <aarcange@xxxxxxxxxx> wrote:
> +static __always_inline void wake_userfault(struct userfaultfd_ctx *ctx,
> +					   struct userfaultfd_wake_range *range)
> +{
> +	if (waitqueue_active(&ctx->fault_wqh))
> +		__wake_userfault(ctx, range);
> +}

Pretty much every single time people use this "if
(waitqueue_active())" model, it tends to be a bug, because it means
that there is zero serialization with people who are just about to go
to sleep. It's fundamentally racy against all the "wait_event()" loops
that carefully do memory barriers between testing conditions and going
to sleep, because the memory barriers now don't exist on the waking
side.

So I'm making a new rule: if you use waitqueue_active(), I want an
explanation for why it's not racy with the waiter. A big comment about
the memory ordering, or about higher-level locks that are held by the
caller, or something.
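
As an illustration of the kind of ordering argument asked for above, here
is a minimal userspace sketch, not kernel code and not taken from the
patch. It models the wait queue with a pthread condition variable and a
waiter counter. The names struct wq, wq_wait(), wq_wake() and
demo_wake_vs_wait() are hypothetical, and the C11 seq_cst fence is assumed
to stand in for the full barrier the waking side would need before a
lockless "is anyone waiting?" check.

```c
/* Userspace model only; every name below is hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

struct wq {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	atomic_int nr_waiters;	/* nonzero models a non-empty wait queue */
	atomic_bool condition;	/* what the waiter sleeps on */
};

static void wq_init(struct wq *q)
{
	pthread_mutex_init(&q->lock, NULL);
	pthread_cond_init(&q->cond, NULL);
	atomic_init(&q->nr_waiters, 0);
	atomic_init(&q->condition, false);
}

/* Waiter: enqueue first, then test the condition. */
static void wq_wait(struct wq *q)
{
	pthread_mutex_lock(&q->lock);
	/* seq_cst RMW: the enqueue is ordered before the condition load. */
	atomic_fetch_add(&q->nr_waiters, 1);
	while (!atomic_load(&q->condition))
		pthread_cond_wait(&q->cond, &q->lock);
	atomic_fetch_sub(&q->nr_waiters, 1);
	pthread_mutex_unlock(&q->lock);
}

/* Waker: lockless "anyone waiting?" check, with the barrier it needs. */
static void wq_wake(struct wq *q)
{
	atomic_store_explicit(&q->condition, true, memory_order_relaxed);
	/*
	 * Full barrier, pairs with the seq_cst increment in wq_wait().
	 * Without it the load below may be satisfied before the store
	 * above is visible: the waker sees no waiter, the waiter sees
	 * the old condition, and the wakeup is lost.
	 */
	atomic_thread_fence(memory_order_seq_cst);
	if (atomic_load_explicit(&q->nr_waiters, memory_order_relaxed) == 0)
		return;
	pthread_mutex_lock(&q->lock);
	pthread_cond_broadcast(&q->cond);
	pthread_mutex_unlock(&q->lock);
}

static void *waiter_thread(void *arg)
{
	wq_wait(arg);
	return NULL;
}

/* Race one waiter against one waker; returns the rounds completed. */
static int demo_wake_vs_wait(int rounds)
{
	int done = 0;

	for (int i = 0; i < rounds; i++) {
		struct wq q;
		pthread_t t;

		wq_init(&q);
		if (pthread_create(&t, NULL, waiter_thread, &q) != 0)
			break;
		wq_wake(&q);
		pthread_join(t, NULL);
		pthread_cond_destroy(&q.cond);
		pthread_mutex_destroy(&q.lock);
		done++;
	}
	return done;
}
```

Calling demo_wake_vs_wait() from a main() and building with something
like "cc -std=c11 -pthread" should let every round finish. Dropping the
fence in wq_wake() reintroduces the window described above, where a waiter
can sleep forever. In the kernel the corresponding pattern would be an
smp_mb() between the condition update and waitqueue_active(), paired with
the barrier that set_current_state() provides on the waiting side.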

Linus