Re: [PATCH] locking: Document that mutex_unlock() is non-atomic

From: Jann Horn
Date: Fri Dec 01 2023 - 13:18:46 EST


On Fri, Dec 1, 2023 at 7:12 PM David Laight <David.Laight@xxxxxxxxxx> wrote:
> From: Jann Horn
> > Sent: 01 December 2023 15:02
> >
> > On Fri, Dec 1, 2023 at 1:33 AM Waiman Long <longman@xxxxxxxxxx> wrote:
> > > On 11/30/23 15:48, Jann Horn wrote:
> > > > I have seen several cases of attempts to use mutex_unlock() to release an
> > > > object such that the object can then be freed by another task.
> > > > My understanding is that this is not safe because mutex_unlock(), in the
> > > > MUTEX_FLAG_WAITERS && !MUTEX_FLAG_HANDOFF case, accesses the mutex
> > > > structure after having marked it as unlocked; so mutex_unlock() requires
> > > > its caller to ensure that the mutex stays alive until mutex_unlock()
> > > > returns.
> > > >
> > > > If MUTEX_FLAG_WAITERS is set and there are real waiters, those waiters
> > > > have to keep the mutex alive, I think; but we could have a spurious
> > > > MUTEX_FLAG_WAITERS left if an interruptible/killable waiter bailed
> > > > between the points where __mutex_unlock_slowpath() did the cmpxchg
> > > > reading the flags and where it acquired the wait_lock.
> > > >
> > > > (With spinlocks, that kind of code pattern is allowed and, from what I
> > > > remember, used in several places in the kernel.)
> > > >
> > > > If my understanding of this is correct, we should probably document this -
> > > > I think such a semantic difference between mutexes and spinlocks is fairly
> > > > unintuitive.
> > >
> > > Spinlocks are fair. So doing a lock/unlock sequence will make sure that
> > > all the previously waiting waiters are done with the lock. Para-virtual
> > > spinlocks, however, can be a bit unfair so doing a lock/unlock sequence
> > > may not be enough to guarantee there is no waiter. The same is true for
> > > mutex. Adding a spin_is_locked() or mutex_is_locked() check can make
> > > sure that all the waiters are gone.
> >
> > I think this pattern anyway only works when you're only trying to wait
> > for the current holder of the lock, not tasks that are queued up on
> > the lock as waiters - so a task initially holds a stable reference to
> > some object, then acquires the object's lock, then drops the original
> > reference, and then later drops the lock.
> > You can see an example of such mutex usage (which is explicitly legal
> > with userspace POSIX mutexes, but is forbidden with kernel mutexes) at
> > the bottom of the POSIX manpage for pthread_mutex_destroy() at
> > <https://pubs.opengroup.org/onlinepubs/007904875/functions/pthread_mutex_destroy.html>,
> > in the section "Destroying Mutexes".
>
> I don't understand at all what any of this is about.
> You cannot de-initialise, free (etc) a mutex (or any other piece of
> memory for that matter) if another thread can have a reference to it.
> If some other code might be holding the mutex it also might be just
> about to acquire it - you always need another lock of some kind to
> ensure that doesn't happen.
>
> IIRC pretty much the only time you need to acquire the mutex in the
> free path is if locks are chained, eg:
> lock(table)
> entry = find_entry();
> lock(entry)
> unlock(table)
> ...
> unlock(entry)
>
> Then the free code has to:
> lock(table)
> remove_from_table(entry)
> lock(entry)
> unlock(entry)
> unlock(table)
> free(entry)

Yep, this is exactly the kind of code pattern for which I'm trying to
document that it is forbidden with mutexes (while it is allowed with
spinlocks).