Re: [PATCH 2/2] drivers: core: Remove glue dirs from sysfs earlier
From: Linus Torvalds
Date: Sat Jun 30 2018 - 22:08:02 EST
[ Ben - feel free to post the missing emails to lkml too ]
On Sat, Jun 30, 2018 at 6:56 PM Benjamin Herrenschmidt
<benh@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Sat, 2018-06-30 at 12:29 -0700, Linus Torvalds wrote:
> >
> > Because normally, the last kobject_put() really does do a synchronous
> > kobject_del(). So normally, this is all completely pointless, and the
> > normal kobject lifetime rules should be sufficient.
>
> Not entirely sadly....
>
> There's a small race between the kref going down to 0 and the last
> kobject_put(). Something might still "catch" the 0-reference object
> during that window.
That's only true if the underlying subsystem doesn't have any
serialization itself.
Which I don't think is normal.
IOW, if you have (say) a PCI hotplug model, the PCI layer will have
the pci_hp_mutex, for example, which protects the PCI hotplug slot
lists and the kobj things.
Those locks won't protect kobj races in _general_ (ie there is no
locking between two totally unrelated buses), but they *should*
serialize the case of a device being added within one class. No?
If not, maybe we need to add some class-based serialization.
> I think it just opens an existing race more widely. The race always
> exist becaues another CPU can observe the object between the reference
> going to 0 and the last kobject_del done by kobject_release.
No it really damn well can't, exactly because we should not allow it -
and allowing it would be fundamentally racy.
We should never allow a kobject_get() with a zero reference count.
It's always a *fundamental* bug - see my other email on your 1/2
patch.
So there is no race, because the reference count itself protects the
object. Either another CPU gets it in time (before it went down to
zero), or it went down to zero and it is dead (because at that point,
deletion will have been scheduled.
Note that this is entirely independent of the auto-clean actions,
since even with absolutely zero cleanup, you still have the
fundamental issue of "reference count goes to zero causes the object
to be free'd".
Linus