Re: possible deadlock in send_sigio
From: Matthew Wilcox
Date: Mon Jun 15 2020 - 16:41:14 EST

On Mon, Jun 15, 2020 at 01:13:51PM -0400, Waiman Long wrote:
> On 6/15/20 12:49 PM, Matthew Wilcox wrote:
> > On Fri, Jun 12, 2020 at 03:01:01PM +0800, Boqun Feng wrote:
> > > On the archs using QUEUED_RWLOCKS, read_lock() is not always a recursive
> > > read lock, actually it's only recursive if in_interrupt() is true. So
> > > change the annotation accordingly to catch more deadlocks.
> > [...]
> >
> > > +#ifdef CONFIG_LOCKDEP
> > > +/*
> > > + * read_lock() is recursive if:
> > > + * 1. We force lockdep to think this way in selftests or
> > > + * 2. The implementation is not queued read/write lock or
> > > + * 3. The locker is at an in_interrupt() context.
> > > + */
> > > +static inline bool read_lock_is_recursive(void)
> > > +{
> > > +	return force_read_lock_recursive ||
> > > +	       !IS_ENABLED(CONFIG_QUEUED_RWLOCKS) ||
> > > +	       in_interrupt();
> > > +}
> >
> > I'm a bit uncomfortable with having the _lockdep_ definition of whether
> > a read lock is recursive depend on what the _implementation_ is.
> > The locking semantics should be the same, no matter which architecture
> > you're running on. If we rely on read locks being recursive in common
> > code then we have a locking bug on architectures which don't use queued
> > rwlocks.
> >
> > I don't know whether we should just tell the people who aren't using
> > queued rwlocks that they have a new requirement or whether we should
> > say that read locks are never recursive, but having this inconsistency
> > is not a good idea!
>
> Actually, qrwlock is more restrictive. It is possible that systems with
> qrwlock may hit deadlocks which don't happen on other systems that use
> recursive rwlock. However, the current lockdep code doesn't detect those
> cases.

Oops. I misread. Still, my point stands; we should have the same
definition of how you're allowed to use locks from the lockdep point of
view, even if the underlying implementation won't deadlock on a particular
usage model.

So I'd be happy with:

+	return lockdep_pretend_in_interrupt || in_interrupt();

to allow the test suite to test that it works as expected, without
actually disabling interrupts while the test suite runs.
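
For illustration only, here is a minimal userspace sketch of how the
helper might look with that one-line body. The flag name
lockdep_pretend_in_interrupt comes from the suggestion above; the
in_interrupt() stub and the fake_irq_context variable are hypothetical
stand-ins so the fragment compiles outside the kernel, and are not
kernel code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's interrupt-context state. */
static bool fake_irq_context;

/* Flag name from the suggestion above; assumed to be set only by selftests. */
static bool lockdep_pretend_in_interrupt;

/* Userspace stub; the real in_interrupt() is provided by the kernel. */
static bool in_interrupt(void)
{
	return fake_irq_context;
}

/*
 * Sketch: lockdep treats read_lock() as recursive only when the locker
 * is in interrupt context (or the selftests pretend it is), with no
 * dependence on which rwlock implementation the architecture uses.
 */
static bool read_lock_is_recursive(void)
{
	return lockdep_pretend_in_interrupt || in_interrupt();
}
```

With this shape, the annotation gives the same answer on every
architecture: the only inputs are the calling context and the selftest
override.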