Re: [PATCH 1/2] mm: add private lock to serialize memory hotplug operations

From: Rafael J. Wysocki
Date: Thu Mar 09 2017 - 17:49:10 EST


On Thursday, March 09, 2017 02:37:55 PM Dan Williams wrote:
> On Thu, Mar 9, 2017 at 2:22 PM, Rafael J. Wysocki <rjw@xxxxxxxxxxxxx> wrote:
> > On Thursday, March 09, 2017 11:15:47 PM Rafael J. Wysocki wrote:
> >> On Thursday, March 09, 2017 10:10:31 AM Dan Williams wrote:
> >> > On Thu, Mar 9, 2017 at 5:39 AM, Rafael J. Wysocki <rjw@xxxxxxxxxxxxx> wrote:
> [..]
> >> > I *think* we're ok in this case because unplugging the CPU package
> >> > that contains a persistent memory device will trigger
> >> > devm_memremap_pages() to call arch_remove_memory(). Removing a pmem
> >> > device can't fail. It may be held off while pages are pinned for DMA
> >> > memory, but it will eventually complete.
> >>
> >> What about the offlining, though? Is it guaranteed that no memory from those
> >> ranges will go back online after the acpi_scan_try_to_offline() call in
> >> acpi_scan_hot_remove()?
> >
> > My point is that after the acpi_evaluate_ej0() in acpi_scan_hot_remove() the
> > hardware is physically gone, so if anything is still doing DMA to that memory at
> > that point, then the user is going to be unhappy.
>
> Hmm, ACPI 6.1 does not have any text about what _EJ0 means for ACPI0012.

ACPI0012 is exceptional, but in general _EJ0 does not have to be present under
a particular device for that device to be affected. It can be under the
device's parent, for example, in which case the entire subtree under the device
with _EJ0 goes away in one go. And that may very well mean a disconnect at the
physical level (IOW, the voltage goes away).
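
To illustrate the subtree point, here is a rough sketch. It is a toy model
only and not the ACPI core code: struct toy_dev and both helpers are made up
for this example, and the only assumption taken from the above is that an
_EJ0 under an ancestor covers everything below that ancestor.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a namespace node; NOT struct acpi_device. */
struct toy_dev {
	const char *name;
	struct toy_dev *parent;	/* NULL for the root of the tree */
	bool has_ej0;		/* _EJ0 is present directly under this node */
};

/*
 * Walk from @dev towards the root and return the nearest node carrying
 * _EJ0 (possibly @dev itself), or NULL if no node on that path has one.
 */
static const struct toy_dev *toy_eject_scope(const struct toy_dev *dev)
{
	for (; dev; dev = dev->parent)
		if (dev->has_ej0)
			return dev;

	return NULL;
}

/*
 * Return true if ejecting @ejected takes @dev away with it, that is, if
 * @ejected is @dev itself or one of its ancestors.
 */
static bool toy_goes_away_with(const struct toy_dev *dev,
			       const struct toy_dev *ejected)
{
	for (; dev; dev = dev->parent)
		if (dev == ejected)
			return true;

	return false;
}
```

With a tree like root -> package (has _EJ0) -> memory, toy_eject_scope() on the
memory node returns the package node even though the memory node has no _EJ0
of its own, and toy_goes_away_with(memory, package) is true.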

Thanks,
Rafael