Re: [PATCH v2 0/3] ACPI: New eject flow to remove devices cautiously

From: Chester Lin
Date: Wed Jul 03 2019 - 06:44:06 EST


On Wed, Jul 03, 2019 at 10:14:39AM +0000, Chester Lin wrote:
> Currently there are two ways to handle ACPI device ejection. When an eject
> event happens on a container, the kernel just sends KOBJ_CHANGE to
> userland, and userland is expected to handle the offline operation. For
> other device types, acpi_scan_try_to_offline() is called; it tries to put
> the target device(s) offline and removes all nodes once they are offline.
>
> However, we found that applications sometimes access resources on
> ejectable devices intensively, so they are at risk if an ejection
> suddenly removes the devices without any notification. Instead of
> executing the offline callbacks directly, we want to introduce a new
> approach that sends change events to notify all target nodes beforehand
> and hands offline handling over to userland, so that userland has a
> chance to schedule an offline task based on the current workload. The
> online function used to recover from failure is changed as well: it
> follows the same approach and sends change events rather than putting
> devices online directly, which means userland also needs to take care
> of online handling.
>
> Because ordinary users might not have their own offline/online handling,
> we will submit a generic udev rule to systemd upstream as a default, so
> that the eject function keeps working: the rule handles the change
> events and takes the corresponding offline or online action. The
> hot-removal part is kept, so the hotplug function can proceed to it once
> all target nodes are offline.
>


Here are the default rules we are going to propose:

# 80-acpi-hotplug-eject.rules
# Generic rules for handling ACPI hotplug eject.

SUBSYSTEM=="*", ACTION=="change", ENV{EVENT}=="offline", ATTR{online}=="1", \
DEVPATH=="*", ATTR{online}="0"

SUBSYSTEM=="*", ACTION=="change", ENV{EVENT}=="online", ATTR{online}=="0", \
DEVPATH=="*", ATTR{online}="1"
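
For anyone who wants custom handling instead of the generic rules, the
same decision can be sketched in a small userland helper. This is only an
illustration, not part of the patch set: the function name
handle_change_event() is hypothetical, and it assumes the event type
arrives as "offline" or "online" (as matched by ENV{EVENT} above) and that
the device directory exposes an "online" attribute.

```python
from pathlib import Path


def handle_change_event(sysfs_dev: Path, event: str) -> bool:
    """Flip the device's `online` attribute the way the rules above do.

    Hypothetical helper: returns True only if the attribute was changed.
    """
    # event -> (state the device must currently be in, state to write)
    transitions = {"offline": ("1", "0"), "online": ("0", "1")}
    if event not in transitions:
        return False  # not an eject-related event, ignore it

    required, target = transitions[event]
    attr = sysfs_dev / "online"
    if not attr.is_file():
        return False  # device has no online/offline control

    # Same guard as ATTR{online}=="1" / ATTR{online}=="0" in the rules.
    if attr.read_text().strip() != required:
        return False

    attr.write_text(target)
    return True
```

A real handler would typically build the path from the event environment,
for example Path("/sys" + os.environ["DEVPATH"]), and could defer the
write until the current workload allows it, which is the point of handing
the decision to userland.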


> To make it easy to monitor the eject status and to restart an eject
> process, this eject flow includes a status trace mechanism. It counts the
> devices currently online under the ejectable target and can reschedule
> an eject event once all nodes within the device tree have been put
> offline.
>
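
The counting idea behind the status trace can be modelled in a few lines.
This is a simplified userland model under stated assumptions, not the
kernel code from the patches: Node, count_online() and on_offline_done()
are hypothetical names, and the count here includes the target node
itself.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """Hypothetical stand-in for a device in the ejectable subtree."""
    name: str
    online: bool = True
    children: List["Node"] = field(default_factory=list)


def count_online(node: Node) -> int:
    """Count the nodes in the subtree that are still online."""
    return int(node.online) + sum(count_online(c) for c in node.children)


def on_offline_done(root: Node, reschedule_eject: Callable[[Node], None]) -> int:
    """Re-check the subtree after a device went offline.

    Reschedules the eject only when nothing is online any more and
    returns the number of devices that are still online.
    """
    remaining = count_online(root)
    if remaining == 0:
        reschedule_eject(root)
    return remaining
```

As long as the returned count is non-zero the eject stays pending; the
callback fires only once the last node has been put offline.
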
> v2:
> - device_sysfs: Add descriptions in Documentation/ABI/testing/sysfs-bus-acpi
> - device_sysfs: Replace the declaration with DEVICE_ATTR_RW and add a cancel
>   option in eject_store.
> - scan: Add a retry mechanism when userspace fails to put a device offline.
> - scan: Add ready-to-remove state.
>
> Chester Lin (3):
>   ACPI / hotplug: Send change events for offline/online requests when
>     eject is triggered
>   ACPI / hotplug: Eject status trace and auto-remove approach
>   ACPI / device_sysfs: Add eject_show and add a cancel option in
>     eject_store
>
> Documentation/ABI/testing/sysfs-bus-acpi | 9 +-
> drivers/acpi/container.c | 2 +-
> drivers/acpi/device_sysfs.c | 94 ++++++-
> drivers/acpi/glue.c | 146 +++++++++++
> drivers/acpi/internal.h | 34 ++-
> drivers/acpi/scan.c | 318 +++++++++++++++++------
> drivers/base/core.c | 4 +
> include/acpi/acpi_bus.h | 3 +-
> include/linux/acpi.h | 6 +
> 9 files changed, 523 insertions(+), 93 deletions(-)
>
> --
> 2.20.1
>