Re: [PATCH] acpi: drop support for force_remove
From: Rafael J. Wysocki
Date: Tue Apr 11 2017 - 09:48:14 EST
On Tue, Apr 11, 2017 at 2:20 PM, Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> On Tue 11-04-17 00:15:42, Rafael J. Wysocki wrote:
>> On Monday, April 10, 2017 07:13:44 PM Michal Hocko wrote:
>> > On Mon 03-04-17 09:40:23, Michal Hocko wrote:
>> > > From: Michal Hocko <mhocko@xxxxxxxx>
>> > >
>> > > /sys/firmware/acpi/hotplug/force_remove was presumably added to support
>> > > auto offlining in the past. This is, however, inherently dangerous for
>> > > some hotpluggable resources like memory. The memory offlining fails when
>> > > the memory is still in use and cannot be dropped or migrated. If we
>> > > ignore the failure we are basically allowing for subtle memory
>> > > corruption or a crash.
>> > >
>> > > We have actually noticed the latter while hitting BUG() during the memory
>> > > hotremove (remove_memory):
>> > > 	ret = walk_memory_range(PFN_DOWN(start), PFN_UP(start + size - 1), NULL,
>> > > 				check_memblock_offlined_cb);
>> > > 	if (ret)
>> > > 		BUG();
>> > >
>> > > It took us a non-trivial amount of time to realize that the customer had
>> > > force_remove enabled. Even if the BUG was removed here and we could
>> > > propagate the error up the call chain it wouldn't help at all because
>> > > then we would hit a crash or a memory corruption later, which would be
>> > > even harder to debug. So force_remove is unfixable for the memory
>> > > hotremove. We haven't checked whether other hotpluggable resources are
>> > > prone to similar problems.
>> > >
>> > > Remove the force_remove functionality because it is not fixable currently.
>> > > Keep the sysfs file and report an error if somebody tries to enable it.
>> > > Encourage users to report the missing functionality and work with
>> > > them on an alternative solution.
>> > >
>> > > Reviewed-by: Lee, Chun-Yi <jlee@xxxxxxxx>
>> > > Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>
>> >
>> > ping on this?
>>
>> /shrug/
>>
>> I'll apply it if nobody has any problems with it.
>
> It is definitely your call but how long are we going to wait
> (considering that the current behavior is obviously broken wrt memory
> hotplug)?

A couple of days more? Surely not going to push it into 4.11 at this
stage as it is an ABI change.

Thanks,
Rafael