Re: [PATCH 3/5] mm, notifier: Catch sleeping/blocking for !blockable

From: Jason Gunthorpe
Date: Thu Aug 15 2019 - 08:36:02 EST


On Thu, Aug 15, 2019 at 09:02:49AM +0200, Daniel Vetter wrote:
> On Wed, Aug 14, 2019 at 09:00:29PM -0300, Jason Gunthorpe wrote:
> > On Wed, Aug 14, 2019 at 10:20:25PM +0200, Daniel Vetter wrote:
> > > We need to make sure implementations don't cheat and don't have a
> > > possible schedule/blocking point deeply buried where review can't
> > > catch it.
> > >
> > > I'm not sure whether this is the best way to make sure all the
> > > might_sleep() callsites trigger, and it's a bit ugly in the code flow.
> > > But it gets the job done.
> > >
> > > Inspired by an i915 patch series which did exactly that, because the
> > > rules haven't been entirely clear to us.
> >
> > I thought lockdep already was able to detect:
> >
> > spin_lock()
> > might_sleep();
> > spin_unlock()
> >
> > Am I mistaken? If yes, couldn't this patch just inject a dummy lockdep
> > spinlock?
>
> Hm ... assuming I didn't get lost in the maze I think might_sleep (well
> ___might_sleep) doesn't do any lockdep checking at all. And we want
> might_sleep, since that catches a lot more than lockdep.

Don't know how it works, but it sure looks like it does:

This:

spin_lock(&file->uobjects_lock);
down_read(&file->hw_destroy_rwsem);
up_read(&file->hw_destroy_rwsem);
spin_unlock(&file->uobjects_lock);

Causes:

[ 33.324729] BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:1444
[ 33.325599] in_atomic(): 1, irqs_disabled(): 0, pid: 247, name: ibv_devinfo
[ 33.326115] 3 locks held by ibv_devinfo/247:
[ 33.326556] #0: 000000009edf8379 (&uverbs_dev->disassociate_srcu){....}, at: ib_uverbs_open+0xff/0x5f0 [ib_uverbs]
[ 33.327657] #1: 000000005e0eddf1 (&uverbs_dev->lists_mutex){+.+.}, at: ib_uverbs_open+0x16c/0x5f0 [ib_uverbs]
[ 33.328682] #2: 00000000505f509e (&(&file->uobjects_lock)->rlock){+.+.}, at: ib_uverbs_open+0x31a/0x5f0 [ib_uverbs]

And this:

spin_lock(&file->uobjects_lock);
might_sleep();
spin_unlock(&file->uobjects_lock);

Causes:

[ 16.867211] BUG: sleeping function called from invalid context at drivers/infiniband/core/uverbs_main.c:1095
[ 16.867776] in_atomic(): 1, irqs_disabled(): 0, pid: 245, name: ibv_devinfo
[ 16.868098] 3 locks held by ibv_devinfo/245:
[ 16.868383] #0: 000000004c5954ff (&uverbs_dev->disassociate_srcu){....}, at: ib_uverbs_open+0xf8/0x600 [ib_uverbs]
[ 16.868938] #1: 0000000020a6fae2 (&uverbs_dev->lists_mutex){+.+.}, at: ib_uverbs_open+0x16c/0x600 [ib_uverbs]
[ 16.869568] #2: 00000000036e6a97 (&(&file->uobjects_lock)->rlock){+.+.}, at: ib_uverbs_open+0x317/0x600 [ib_uverbs]

I think this is done in some very expensive way, so it probably only
works when lockdep is enabled.
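
For what it's worth, here is a toy userspace sketch of what I assume is
going on. Going by the "in_atomic(): 1" in both splats, my guess is that
the check keys off the preempt count that spin_lock() raises, and that
lockdep only contributes the "locks held" listing. This is not kernel
code: every model_* name is made up for illustration, and the real check
likely looks at more state than this (irqs_disabled() is printed too).

```c
/* Userspace model only, NOT kernel code. All model_* names are hypothetical. */
#include <assert.h>
#include <stdio.h>

static int model_preempt_count;	/* stands in for the kernel's preempt count */
static int model_reports;	/* number of "BUG:" lines emitted so far */

/* Assumption: taking a spinlock makes the context atomic by raising the count. */
static void model_spin_lock(void)
{
	model_preempt_count++;
}

static void model_spin_unlock(void)
{
	assert(model_preempt_count > 0);
	model_preempt_count--;
}

/* Complain only when called while the count says we are atomic. */
static void model_might_sleep(const char *file, int line)
{
	if (model_preempt_count == 0)
		return;
	model_reports++;
	fprintf(stderr,
		"BUG: sleeping function called from invalid context at %s:%d\n",
		file, line);
}

#define model_might_sleep_here() model_might_sleep(__FILE__, __LINE__)

/* Replays the second experiment above and returns the number of reports. */
static int model_replay(void)
{
	model_reports = 0;
	model_might_sleep_here();	/* not atomic: silent */
	model_spin_lock();
	model_might_sleep_here();	/* atomic: one report */
	model_spin_unlock();
	model_might_sleep_here();	/* not atomic again: silent */
	return model_reports;
}
```

Calling model_replay() from a main() prints a single BUG line, for the
call made between lock and unlock. If the real check works along these
lines, a dummy lock around the notifier call would be enough to make
every might_sleep() underneath it fire.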

Jason