Re: [PATCH] drivers/base: use a worker for sysfs unbind

From: Rafael J. Wysocki
Date: Thu Dec 13 2018 - 05:23:56 EST


On Thu, Dec 13, 2018 at 10:58 AM Daniel Vetter <daniel@xxxxxxxx> wrote:
>
> On Thu, Dec 13, 2018 at 10:38:14AM +0100, Rafael J. Wysocki wrote:
> > On Mon, Dec 10, 2018 at 9:47 AM Daniel Vetter <daniel.vetter@xxxxxxxx> wrote:
> > >
> > > Drivers might want to remove some sysfs files, which needs the same
> > > locks and ends up angering lockdep. Relevant snippet of the stack
> > > trace:
> > >
> > > kernfs_remove_by_name_ns+0x3b/0x80
> > > bus_remove_driver+0x92/0xa0
> > > acpi_video_unregister+0x24/0x40
> > > i915_driver_unload+0x42/0x130 [i915]
> > > i915_pci_remove+0x19/0x30 [i915]
> > > pci_device_remove+0x36/0xb0
> > > device_release_driver_internal+0x185/0x250
> > > unbind_store+0xaf/0x180
> > > kernfs_fop_write+0x104/0x190
> >
> > Is the acpi_bus_unregister_driver() in acpi_video_unregister() the
> > source of the lockdep unhappiness?
>
> Yeah I guess I cut out too much of the lockdep splat. It complains about
> kernfs_fop_write and kernfs_remove_by_name_ns acquiring the same lock
> class. It's ofc not the same lock, so no real deadlock. Getting the
> device_release_driver outside of the callchain under kernfs_fop_write,
> which this patch does, "fixes" it. For "fixes" = shut up lockdep.

OK, so the problem really is that the operation is started via sysfs
which means that this code is running under a lock already.

Which lock does lockdep complain about, exactly?

> Other options:
> - Anotate the recursion with the usual lockdep annotations. Potentially
> results in lockdep not catching real deadlocks (you can still have other
> loops closing the deadlock, maybe through some subsystem/bus lock).
>
> - Rewrite kernfs_fop_write to drop the lock (optionally, for callbacks
> that know what they're doing), which should be fine if we refcount
> everything properly (bus, driver & device).
>
> - Also note that probably the same bug exists on the bind sysfs interface,
> but we don't use that, so I don't care :-)
>
> - Most of these issues are never visible in normal usage, since normally
> driver bind/unbind is done from a kthread or model_load/unload, neither
> of which is running in the context of that kernfs mutex kernfs_fop_write
> holds. That's why I think the task work is the best solution, since it
> changes the locking context of the unbind sysfs to match the locking
> context of module unload and hotunplug.

I think that using a task work here makes sense. There is a drawback,
which is that the original sysfs write will not wait for the driver to
actually be released before returning to user space AFAICS, but that
probably isn't a big deal.

Also please note that the patch changes the code flow slightly,
because passing a non-NULL parent pointer to
device_release_driver_internal() potentially has side effects, but
that should not be a big deal either.

> Unfortunately that trick doesn't work for the bind sysfs file, since that way we can't thread the errno value back to userspace.

Right. That is unless we wait for the operation to complete and check
the error left behind by it. That should be doable, but somewhat
complicated.