Re: [PATCH v7 09/12] sysfs: fix deadlock race with module removal
From: Dan Williams
Date: Mon Sep 20 2021 - 22:05:11 EST
On Mon, Sep 20, 2021 at 2:17 PM Luis Chamberlain <mcgrof@xxxxxxxxxx> wrote:
>
> On Mon, Sep 20, 2021 at 01:52:21PM -0700, Dan Williams wrote:
> > On Fri, Sep 17, 2021 at 10:05 PM Luis Chamberlain <mcgrof@xxxxxxxxxx> wrote:
> > > This deadlock was first reported with the zram driver, however the live
> > > patching folks have acknowledged they have observed this as well with
> > > live patching, when a live patch is removed. I was then able to
> > > reproduce easily by creating a dedicated selftests.
> > >
> > > A sketch of how this can happen follows:
> > >
> > > CPU A CPU B
> > > whatever_store()
> > > module_unload
> > > mutex_lock(foo)
> > > mutex_lock(foo)
> > > del_gendisk(zram->disk);
> > > device_del()
> > > device_remove_groups()
> >
> > This flow seems possible to trigger with:
> >
> > echo $dev > /sys/bus/$bus/drivers/$driver/unbind
> >
> > I am missing why module pinning
>
> The aspect of try_module_get() which comes to value to prevent the
> deadlock is it ensures kernfs ops do not run once exit is on the way.
>
> > is part of the solution when it's the
> > device_del() path that is racing?
>
> But its not, the device_del() path will yield until the kernfs op
> completes. It is fine to wait there.
>
> The deadlock happens if a module exit routine uses a lock which is
> also used on a sysfs op. If the lock was first held by module exit,
> and module exit is waiting for the kernfs op to complete, and the
> kernfs op is waiting to hold the same lock then the exit will wait
> forever.
>
> > Module removal is just a more coarse
> > grained way to trigger unbind => device_del().
>
> Right, but the device_del() path is not sharing a lock with the sysfs op.
The deadlock in the example comes from holding a lock over
device_del() that is also taken in a kernfs op. For example, the code
above looks like something that runs from driver.remove(), not
exclusively module_exit(). Yes, module_exit() may trigger
driver.remove() via driver_unregister(), but there's other ways to
trigger driver.remove() that do not involve module_exit().
> > Isn't the above a bug
> > in the driver, not missing synchronization in kernfs?
>
> We can certainly take the position as an alternative:
>
> "thou shalt not use a lock on exit which is also used on a syfs op"
>
> However that seems counter intuitive, specially if we can resolve the
> issue easily with a try_module_get().
Again, I don't see how try_module_get() can affect the ABBA failure
case of holding a lock over device_del() that is also held inside
sysfs op. I agree that the problem is subtle. Does lockdep not
complain about this case? If it's going to be avoided in the core it
seems try_module_get() does not completely cover the hole that
unsuspecting driver writers might fall into.