Re: [Patch v2] sysfs: add lockdep class support to s_active

From: Xiaotian Feng
Date: Fri Feb 05 2010 - 05:01:08 EST


On Fri, Feb 5, 2010 at 5:39 PM, Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
> Xiaotian Feng <xtfeng@xxxxxxxxx> writes:
>
>> On Fri, Feb 5, 2010 at 2:42 PM, Amerigo Wang <amwang@xxxxxxxxxx> wrote:
>>> Recently we hit a lockdep warning from sysfs during s2ram or cpu hotplug.
>>> As reported by several people, it is something like:
>>>
>>> [ 6967.926563] ACPI: Preparing to enter system sleep state S3
>>> [ 6967.956156] Disabling non-boot CPUs ...
>>> [ 6967.970401]
>>> [ 6967.970408] =============================================
>>> [ 6967.970419] [ INFO: possible recursive locking detected ]
>>> [ 6967.970431] 2.6.33-rc2-git6 #27
>>> [ 6967.970439] ---------------------------------------------
>>> [ 6967.970450] pm-suspend/22147 is trying to acquire lock:
>>> [ 6967.970460]  (s_active){++++.+}, at: [<c10d2941>]
>>> sysfs_hash_and_remove+0x3d/0x4f
>>> [ 6967.970493]
>>> [ 6967.970497] but task is already holding lock:
>>> [ 6967.970506]  (s_active){++++.+}, at: [<c10d4110>]
>>> sysfs_get_active_two+0x16/0x36
>>> [...]
>>>
>>> Eric already provided a patch for this[1], but it still doesn't fix the
>>> problem. Based on his work and Peter's suggestion, I wrote this patch;
>>> hopefully it fixes the warning completely.
>>>
>>> This patch puts the sysfs s_active locks into two classes, one for PM and
>>> one for everything else, so that lockdep can distinguish them.
>>
>> I think this patch does not address the root cause; we have a similar
>> warning which is not related to PM.
>
> The root cause is that our locking is crazy complicated.  No lockdep
> changes are going to fix that.
>
> What we can do, and what the patch does, is teach lockdep to treat some
> of the sysfs files as a different group (subclass) from other sysfs
> files.  That keeps us from overgeneralizing and gives us a better
> signal-to-noise ratio.
>
> As far as the block device problem goes, I can't easily say that
> the block layer is correct.  I expect it is, because changing
> the scheduler is unlikely to delete block devices.  If the block layer
> has bugs, then adding another subclass as Amerigo suggests should simply
> make lockdep warnings harder to trigger and more accurate, so that
> sounds like a path worth walking.
>
> In general I recommend that pieces of code that need to do a lot of
> work in a sysfs attribute consider using a work queue or a kernel
> thread, as that can be easier to analyze.

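In both cases, a store to one sysfs attribute removes other sysfs
entries while that attribute's s_active is still held:

PM case
store /sys/devices/system/cpu/cpu1/online
remove /sys/devices/system/cpu/cpu1/cache/

iosched case
store /sys/block/sdx/queue/scheduler
remove /sys/block/sdx/queue/iosched/

So it looks like this comes from the sysfs layer itself ....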
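
To illustrate why both cases trip the same check, here is a toy
userspace model in C. It is not kernel code and not the actual lockdep
implementation; it only assumes that the recursion check compares lock
classes rather than lock instances. All names in it (held_stack,
S_ACTIVE_DEFAULT, S_ACTIVE_SEPARATE, store_then_remove_warns) are made
up for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical class ids, not real lockdep keys. */
enum { S_ACTIVE_DEFAULT = 0, S_ACTIVE_SEPARATE = 1 };

#define MAX_HELD 8

/* Lock classes currently held by one task, in acquisition order. */
struct held_stack {
    int classes[MAX_HELD];
    size_t depth;
};

/* True if a lock of class `cls` is already held. The model assumes the
 * checker looks at classes only, so two different sysfs nodes sharing
 * one s_active class look like recursive locking. */
static bool class_already_held(const struct held_stack *h, int cls)
{
    for (size_t i = 0; i < h->depth; i++)
        if (h->classes[i] == cls)
            return true;
    return false;
}

/* Record an acquisition; returns true if the toy checker would warn. */
static bool acquire(struct held_stack *h, int cls)
{
    bool warn = class_already_held(h, cls);

    assert(h->depth < MAX_HELD);
    h->classes[h->depth++] = cls;
    return warn;
}

/* Drop the most recently acquired lock. */
static void release(struct held_stack *h)
{
    assert(h->depth > 0);
    h->depth--;
}

/* Models "store to attribute A, which removes node B": the store runs
 * with A's s_active held, and the removal then takes B's s_active.
 * Returns true if the second acquisition would be flagged. */
bool store_then_remove_warns(int store_class, int removed_class)
{
    struct held_stack h = { .depth = 0 };
    bool warned;

    acquire(&h, store_class);               /* e.g. .../cpu1/online */
    warned = acquire(&h, removed_class);    /* e.g. .../cpu1/cache/ */
    release(&h);
    release(&h);
    return warned;
}
```

With both nodes in one class,
store_then_remove_warns(S_ACTIVE_DEFAULT, S_ACTIVE_DEFAULT) reports a
warning; with the store attribute in its own class,
store_then_remove_warns(S_ACTIVE_SEPARATE, S_ACTIVE_DEFAULT) does not.
The PM and iosched cases have the same shape in this model, which is
why a PM-only class would not cover both.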

>
> Eric
>