Re: [RFC PATCH v5 2/2] Use kernfs_break_active_protection() for device online store callbacks

From: Rafael J. Wysocki
Date: Fri Apr 25 2014 - 08:47:25 EST


On 4/25/2014 3:46 AM, Li Zhong wrote:
On Thu, 2014-04-24 at 12:02 +0200, Rafael J. Wysocki wrote:
On 4/24/2014 10:59 AM, Li Zhong wrote:
On Wed, 2014-04-23 at 18:12 +0200, Rafael J. Wysocki wrote:
On 4/23/2014 4:23 PM, Tejun Heo wrote:
Hello, Rafael.
Hi,

On Wed, Apr 23, 2014 at 12:21:33AM +0200, Rafael J. Wysocki wrote:
Can you please elaborate a bit?
Because it can get involved in larger locking dependency issues by
joining the dependency graphs of two otherwise largely disjoint
subsystems. It has the potential to create deadlocks which don't need
to exist.
Well, I do my best not to add unnecessary locks if that's what you mean.

It is there to protect hotplug operations involving multiple devices
(in different subsystems) from racing with each other. Why exactly
is it bad?
But why would different subsystems, say cpu and memory, use the same
lock? Wouldn't those subsystems already have proper locking inside
their own subsystems?
That locking is not sufficient.

Why add this additional global lock across multiple subsystems?
That basically is because of how eject works when it is triggered via ACPI.

It is signaled for a device at the top of a subtree. It may be a
container of some sort and the eject involves everything below that
device in the ACPI namespace. That may involve multiple subsystems
(CPUs, memory, PCI host bridge, etc.).

We do that in two steps: offline (which can fail) and eject proper
(which can't fail and makes all of the involved devices go away). All
of that has to be done in one go with respect to sysfs-triggered
offline/online, and that's why the lock is there.
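
Schematically, it looks like the following. This is only a sketch:
struct eject_set, for_each_eject_device() and the *_eject_devices()
helpers are made-up names standing in for the real ACPI scan code,
while lock_device_hotplug(), device_offline() and device_online() are
the real interfaces.

#include <linux/device.h>

static int eject_subtree(struct eject_set *set)
{
	struct device *dev;
	int error;

	lock_device_hotplug();

	/* Step 1: offline everything in the subtree; this may fail. */
	for_each_eject_device(set, dev) {
		error = device_offline(dev);
		if (error < 0)
			goto rollback;
	}

	/* Step 2: eject proper; this cannot fail any more. */
	remove_eject_devices(set);

	unlock_device_hotplug();
	return 0;

rollback:
	/* Put the devices offlined so far back online. */
	online_eject_devices(set);
	unlock_device_hotplug();
	return error;
}
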
Thank you for the explanation. It's clearer to me now why we need this
lock.

I still have some small questions about when this lock is needed:

I can understand that sysfs-triggered online is not acceptable while
removing devices in multiple subsystems. But wouldn't a concurrent
offline and remove (with proper per-subsystem locks) be harmless?

And if we just want to remove devices in a single subsystem, e.g. by
writing to /cpu/release to offline and remove some CPUs, then maybe we
don't need to take device_hotplug_lock?
No and no.

If the offline phase fails for any device in the subtree, we roll back
the operation and online devices that have already been offlined by it.
Also the ACPI hot-addition needs to acquire device_hotplug_lock so that
it doesn't race with ejects and so that lock needs to be taken by
sysfs-triggered offline too.
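
To make the rollback part concrete, the idiom is roughly the following.
Again a sketch: struct eject_entry and the eject_list are hypothetical
bookkeeping, while device_offline()/device_online() and the "negative
return means failure" convention are the real ones.

#include <linux/device.h>
#include <linux/list.h>

/* Hypothetical bookkeeping for the devices in the ejected subtree. */
struct eject_entry {
	struct list_head node;
	struct device *dev;
};

static int offline_or_rollback(struct list_head *eject_list)
{
	struct eject_entry *e;
	int error = 0;

	list_for_each_entry(e, eject_list, node) {
		error = device_offline(e->dev);
		if (error < 0)	/* negative return means the offline failed */
			break;
	}

	if (error < 0) {
		/* Walk back and re-online everything offlined before the failure. */
		list_for_each_entry_continue_reverse(e, eject_list, node)
			device_online(e->dev);
		return error;
	}

	return 0;
}
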
I can understand that hot-addition needs device_hotplug_lock, but I'm
still not very clear about the offline case.

I guess you are describing the following scenario:

user A: (trying to remove cpu 1 and memory 1-10)

lock_device_hotplug
offline cpu with cpu locks -- successful
offline memories with memory locks -- failed, e.g. for memory8
online the cpu and memories again with their locks
unlock_device_hotplug

What if all of that is successful and CPU1 is gone before device_hotplug_lock is released?

user B: (trying to offline cpu 1)

offline cpu with cpu locks

But I don't see any problem with user B not taking device_hotplug_lock.
The result may be different depending on whether or not user B takes the
lock, but I think it could be seen as a concurrent online/offline of cpu1
under the cpu hotplug locks, where the outcome just depends on which one
is executed last?

Or did I miss something here?

Yes, you could do the offline in parallel with user A without taking device_hotplug_lock, but then the result may be surprising to user B.

With device_hotplug_lock, user B will always see CPU1 offline (or gone) after his offline in this scenario, while without taking the lock user B may sometimes see CPU1 online after his offline. I don't think that's a good thing.
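
For reference, the sysfs knob this patch is about ends up doing roughly
the following (paraphrasing drivers/base/core.c from memory, so treat it
as a sketch rather than the exact code):

static ssize_t online_store(struct device *dev, struct device_attribute *attr,
			    const char *buf, size_t count)
{
	bool val;
	int ret;

	ret = strtobool(buf, &val);
	if (ret < 0)
		return ret;

	/*
	 * Take device_hotplug_lock, but back off and restart the syscall
	 * if the lock can't be taken right away, so that this write can't
	 * deadlock against an ongoing removal of the sysfs file itself --
	 * the problem the patch in the subject tries to address with
	 * kernfs_break_active_protection() instead.
	 */
	ret = lock_device_hotplug_sysfs();
	if (ret)
		return ret;

	ret = val ? device_online(dev) : device_offline(dev);
	unlock_device_hotplug();

	return ret < 0 ? ret : count;
}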

Rafael
