Re: [PATCH 1/4] acpi,memory-hotplug : add memory offline code toacpi_memory_device_remove()

From: Wen Congyang
Date: Fri Oct 19 2012 - 05:03:13 EST

At 10/19/2012 03:44 AM, KOSAKI Motohiro Wrote:
>>>>>> + if (type == ACPI_BUS_REMOVAL_EJECT) {
>>>>>> + /*
>>>>>> + * offline and remove memory only when the memory device is
>>>>>> + * ejected.
>>>>>> + */
>>>>> This comment explain nothing. A comment should describe _why_ should we do.
>>>>> e.g. Why REMOVAL_NORMAL and REMOVEL_EJECT should be ignored. Why
>>>>> we need remove memory here instead of ACPI_NOTIFY_EJECT_REQUEST.
>>>> Hmm, we have 2 ways to remove a memory:
>>>> 1. SCI
>>>> 2. echo 1 >/sys/bus/acpi/devices/PNP0C80:XX/eject
>>>> In the 2nd case, there is no ACPI_NOTIFY_EJECT_REQUEST. We should offline
>>>> the memory and remove it from kernel in the release callback. We will poweroff
>>>> the memory device in acpi_bus_hot_remove_device(), so we must offline
>>>> and remove it if the type is ACPI_BUS_REMOVAL_EJECT.
>>>> I guess we should not poweroff the memory device when we fail to offline it.
>>>> But device_release_driver() doesn't returns any error...
>>> 1) I think /sys/bus/acpi/devices/PNP0C80:XX/eject should emulate acpi
>>> eject. Can't
>>> you make a pseudo acpi eject event and detach device by acpi regular path?
>> It is another issue. And we only can implement it here with current acpi
>> implemention. Some other acpi devices(for example: cpu) do it like this.
> Hint: only cpu take like this.
>>> 2) Your explanation didn't explain why we should ignore REMOVAL_NORMAL
>>> and REMOVEL_EJECT. As far as reviewers can't track your intention, we
>>> can't maintain
>>> the code and can't ack them.
>> REMOVAL_NORMAL means the user want to unbind the memory device from this driver.
>> It is no need to eject the device, and we can still use this device after unbinding.
>> So it can be ignored.
>> REMOVAL_EJECT means the user want to eject and remove the device, and we should
>> not use the device. So we should offline and remove the memory here.
> This is not exactly correct, IMHO. Usually, we must not touch unbinded
> device because
> they are out of OS control. If I understand is correct, the main
> reason is to distinguish a
> rollback of driver initialization failure and true ejection.
> REMOVAL_NORMAL is usually used for rollback and REMOVAL_EJECT is used for
> removal device eject. Typical device don't need to distinguish them
> because we should
> deallocate every resource even when driver initialization failure.
> However, cpu and memory are exceptions. They are recognized from kernel before
> driver initialization. Then even if machine have crappy acpi table and
> make failure acpi
> initialization, disabling memory make no sense.

Hmm, IIRC, if the memory is recognized from kerenl before driver initialization,
the memory device is not managed by the driver acpi_memhotplug.

I think we should also deal with REMOVAL_NORMAL here now. Otherwise it will cause
some critical problem: we unbind the device from the driver but we still use
it. If we eject it, we have no chance to offline and remove it. It is very dangerous.

Wen Congyang

> And, when you make _exceptional_ rule, you should comment verbosely in the code
> the detail. likes 1) why we need. 2) which
> device/machine/environment suffer such exception. 2) what affect
> other subsys.
> Even though cpu hotplug has crappy poor comment and document, please
> don't follow
> them.

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at