Re: Please add to stable: module: don't unlink the module until we've removed all exposure.

From: Rusty Russell
Date: Tue Jun 04 2013 - 02:01:20 EST


Ben Greear <greearb@xxxxxxxxxxxxxxx> writes:
> On 06/03/2013 08:59 AM, Ben Greear wrote:
>> On 06/03/2013 07:17 AM, Joe Lawrence wrote:
>>
>>>>> Hi Rusty,
>>>>>
>>>>> I had pointed Ben (offlist) to that bugzilla entry without realizing
>>>>> there were other earlier related fixes in this space. Reviewing
>>>>> bz-58011, it looks like it was opened against 3.8.12, while Ben and I
>>>>> had encountered module loading problems in versions 3.9 and
>>>>> 3.9.[1-3]. I can update the bugzilla entry to add a comment noting commit
>>>>> a49b7e82 "kobject: fix kset_find_obj() race with concurrent last
>>>>> kobject_put()".
>>>>>
>>>>> That said, it doesn't appear that commit 944a1fa "module: don't unlink the
>>>>> module until we've removed all exposure" has made it into any stable
>>>>> kernel. On my system, applying this on top of 3.9 resolved a module
>>>>> unload/load race that would occasionally occur on boot (two video adapters
>>>>> of the same make, the module unloads for whatever reason and I see "module
>>>>> is already loaded" and "sysfs: cannot create duplicate filename
>>>>> '/module/mgag200'" messages in 5-10% of instances.) I have logs if you
>>>>> are interested in these warnings/crashes.
>>
>> It at least works around the problem for me as well. But a rarer lockup,
>> related (I think) to migration/[0-3], still exists in 3.9.4 for me,
>> so I will also try applying that other kobject patch and continue testing
>> today...
>
> Well, that other kobject patch is already in 3.9.4, so I think it's still
> a good idea to include the
> "module: don't unlink the module until we've removed all exposure."
> patch in stable. I have a decent test case to reproduce the crash, so if someone
> wants me to test other patches instead, then I will do so.

I understand your eagerness to have this resolved, but we need to
understand the problem. The fix you asked for in stable was supposed to
be cosmetic, to avoid the sysfs warning. But it did serve to stress the
cleanup path, which may still have lurking bugs!

I reproduced the oops myself on 3.8. I will chase it on 3.9.4, too.

Thanks,
Rusty.