Re: Please add to stable: module: don't unlink the module until we've removed all exposure.

From: Rusty Russell
Date: Wed Jun 05 2013 - 03:38:32 EST


Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> writes:
> On Mon, Jun 03, 2013 at 10:17:17AM -0400, Joe Lawrence wrote:
>> [Cc: stable@xxxxxxxxxxxxxxx]
>>
>> Third time is a charm? The stable address was incorrect from the first
>> msg in this thread, but the relevant bits remain quoted below...
>
> Really? I'm totally confused...
>
>> On Mon, 3 Jun 2013, Joe Lawrence wrote:
>>
>> > [fixing Cc: stable@xxxxxxxxxx address]
>> >
>> > On Sun, 2 Jun 2013, Joe Lawrence wrote:
>> >
>> > > On Sun, 2 Jun 2013, Rusty Russell wrote:
>> > >
>> > > > Ben Greear <greearb@xxxxxxxxxxxxxxx> writes:
>> > > >
>> > > > > It turns out, the bug I spent yesterday chasing in various 3.9 kernels is apparently
>> > > > > fixed by the commit in the title (c9c390bb5535380d40614571894ef0c00bc026ff).
>> > > >
>> > > > Apparently being the operative word.
>> > > >
>> > > > This commit avoids the entire "module insert failed due to sysfs race"
>> > > > path in the common case, it doesn't fix any actual problem.
>> > > >
>> > > > I think the real commit you want is Linus' kobject fix
>> > > > a49b7e82cab0f9b41f483359be83f44fbb6b4979 "kobject: fix kset_find_obj()
>> > > > race with concurrent last kobject_put()".
>> > > >
>> > > > Or is that already in stable?
>> > >
>> > > Hi Rusty,
>> > >
>> > > I had pointed Ben (offlist) to that bugzilla entry without realizing
>> > > there were other earlier related fixes in this space. Re-viewing
>> > > bz-58011, it looks like it was opened against 3.8.12, while Ben and myself
>> > > had encountered module loading problems in versions 3.9 and
>> > > 3.9.[1-3]. I can update the bugzilla entry to add a comment noting commit
>> > > a49b7e82 "kobject: fix kset_find_obj() race with concurrent last
>> > > kobject_put()".
>> > >
>> > > That said, it doesn't appear that commit 944a1fa "module: don't unlink the
>> > > module until we've removed all exposure" has made it into any stable
>> > > kernel. On my system, applying this on top of 3.9 resolved a module
>> > > unload/load race that would occasionally occur on boot (two video adapters
>> > > of the same make, the module unloads for whatever reason and I see "module
>> > > is already loaded" and "sysfs: cannot create duplicate filename
>> > > '/module/mgag200'" messages in 5-10% of instances.) I have logs if you
>> > > are interested in these warnings/crashes.
>> > >
>> > > Hope this clarifies things.
>
> After this whole thread, what should I be doing for the 3.9-stable tree?
> Add commit 944a1fa? Or something else?

Yes, add 944a1fa. It does fix an Oops unrelated to what it was intended
to fix, so it's the lowest-pain path.

There may be other ways of triggering a similar oops, but so far the
obvious attempt has failed (holding a sysfs file open while a module
fails its init). I might patch it anyway, because it makes me
uncomfortable, but that's separate.

Thanks,
Rusty.