Re: Problem with WARN_ON in mutex_trylock() and rxrpc
From: Peter Zijlstra
Date: Thu Dec 05 2019 - 08:22:26 EST
On Thu, Dec 05, 2019 at 12:02:24PM +0000, David Howells wrote:
> Hi Davidlohr,
>
> commit a0855d24fc22d49cdc25664fb224caee16998683 ("locking/mutex: Complain upon
> mutex API misuse in IRQ contexts") is a bit of a problem for rxrpc, though
> nothing that shouldn't be reasonably easy to solve, I think.
>
> What happens is that rxrpc_new_incoming_call(), which is called in softirq
> context, calls mutex_trylock() to prelock a new incoming call:
>
> /* Lock the call to prevent rxrpc_kernel_send/recv_data() and
> * sendmsg()/recvmsg() inconveniently stealing the mutex once the
> * notification is generated.
> *
> * The BUG should never happen because the kernel should be well
> * behaved enough not to access the call before the first notification
> * event and userspace is prevented from doing so until the state is
> * appropriate.
> */
> if (!mutex_trylock(&call->user_mutex))
> BUG();
>
> before publishing it. This used to work fine, but now there are big splashy
> warnings every time a new call comes in.
>
> No one else can see the lock at this point, but I need to lock it so that
> lockdep doesn't complain later. However, I can't lock it in the preallocator
> - again because that upsets lockdep.
To recap the IRC discussion; the intended mutex semantics are such to
allow Priority Inheritance. This means that the mutex must be locked and
unlocked in the same (task) context. Otherwise there is no distinct
owner to boost for contending mutex_lock() operations.
Since (soft)irq context doesn't (necessarily) have a task context, these
operations don't strictly make sense, and that is what the patch in
question tries to WARN about.
As it happens, you do mutex_unlock() from the very same softirq context
you do that mutex_trylock() in, so lockdep will never have had cause to
complain, 'current' is the same at acquire and release.
Now, either we're in non-preemptible softirq context and a contending
mutex_lock() would spuriously boost a random task, which is harmless due
to the non-preemptive nature of softirq, or we're running in ksoftirqd
and that gets boosted, which actually makes some sense.
For PREEMPT_RT (the only case that really matters, since that actually
replaces struct mutex with rt_mutex) this would result in boosting
whatever (soft)irq thread ended up running the thing.
(Also, I'm not entire sure on the current softirq model for -RT)
Is this something we want to allow?
At the very least I'm going to do a lockdep patch that verifies the lock
stack is 'empty' for the current irq_context when it changes.