Re: [PATCH RFC 1/2] drivers/base: export lock_device_hotplug/unlock_device_hotplug
From: David Hildenbrand
Date: Fri Aug 17 2018 - 07:56:44 EST
On 17.08.2018 13:28, Heiko Carstens wrote:
> On Fri, Aug 17, 2018 at 01:04:58PM +0200, David Hildenbrand wrote:
>>>> If there are no objections, I'll go into that direction. But I'll wait
>>>> for more comments regarding the general concept first.
>>>
>>> It is the middle of the merge window, and maintainers are really busy
>>> right now. I doubt you will get many review comments just yet...
>>>
>>
>> This has been broken since 2015, so I guess it can wait a bit :)
>
> I hope you figured out what needs to be locked why. Your patch description
> seems to be "only" about locking order ;)
Well I hope so, too ... but there is a reason for the RFC mark ;) There
is definitely a lot of magic in the current code. And that's why it is
also not that obvious that locking is wrong.
To avoid/fix the locking order problem was the motivation for the
original patch that dropped mem_hotplug_lock on one path. So I focused
on that in my description.
>
> I tried to figure out and document that partially with 55adc1d05dca ("mm:
> add private lock to serialize memory hotplug operations"), and that wasn't
> easy to figure out. I was especially concerned about sprinkling
Haven't seen that so far as that was reworked by 3f906ba23689
("mm/memory-hotplug: switch locking to a percpu rwsem"). Thanks for the
pointer. There is a long history to all this.
> lock/unlock_device_hotplug() calls, which has the potential to make it the
> next BKL thing.
Well, the thing with memory hotplug and device_hotplug_lock is that
a) ACPI already holds it while adding/removing memory via add_memory()
b) we hold it during online/offline of memory (via sysfs calls to
device_online()/device_offline())
So it is already pretty much involved in all memory hotplug/unplug
activities on x86 (except paravirt). And as far as I understand, there
are good reasons to hold the lock in core.c and ACPI. (as mentioned by
Rafael)
The exceptions are add_memory() called on s390x, hyper-v, xen and ppc
(including manual probing). And device_online()/device_offline() called
from the kernel.
Holding device_hotplug_lock when adding/removing memory from the system
doesn't sound too wrong (especially as devices are created/removed). At
least that way (documenting and following the rules in the patch
description) we might at least get locking right.
I am very open to other suggestions (but as noted by Greg, many
maintainers might be busy by know).
E.g. When adding the memory block devices, we know that there won't be a
driver to attach to (as there are no drivers for the "memory" subsystem)
- the bus_probe_device() function that takes the device_lock() could
pretty much be avoided for that case. But burying such special cases
down in core driver code definitely won't make locking related to memory
hotplug easier.
Thanks for having a look!
--
Thanks,
David / dhildenb