Re: [PATCH v5 8/9] dax/kmem: add sysfs interface for atomic whole-device hotplug
From: Hannes Reinecke
Date: Thu Jun 25 2026 - 02:18:01 EST
That looks good, but question remains:
On 6/24/26 4:57 PM, Gregory Price wrote:
There is no atomic mechanism to offline and remove an entire
multi-block DAX kmem device. This is presently done in two steps:
1. offline all
2. remove all
This creates a race condition where another entity operates directly
on the memory blocks and can cause hot-unplug to fail / unbind to
deadlock.
Add a new 'state' sysfs attribute that enables an atomic whole-device
hotplug operation across its entire memory region.
daxX.Y/state mirrors the per-block memoryX/state ABI:
- [offline, online, online_kernel, online_movable]
- "unplugged" is added specifically for daxX.Y/state
The valid writable states include:
- "unplugged": memory blocks are not present
- "online": memory is online, zone chosen by the kernel
- "online_kernel": memory is online in ZONE_NORMAL
- "online_movable": memory is online in ZONE_MOVABLE
Valid transitions:
- unplugged -> online[_kernel|_movable]
- online[_kernel|_movable] -> unplugged
- offline -> unplugged
A device can only be onlined from "unplugged", so it must be returned
there before being onlined into a different state.
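The transition rules above can be restated as a small userspace model. This is only a sketch for illustration, not code from the patch: the enum, parse_state() and transition_allowed() are hypothetical names, and treating a write of the current state as "not a transition" is an assumption, since the description does not say how the attribute handles that case.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the states named in the patch description. */
enum dax_state_model {
	ST_UNPLUGGED,
	ST_OFFLINE,
	ST_ONLINE,
	ST_ONLINE_KERNEL,
	ST_ONLINE_MOVABLE,
	ST_INVALID,
};

/* Map a sysfs-style string to a model state; unknown input is invalid. */
static enum dax_state_model parse_state(const char *s)
{
	if (s == NULL)
		return ST_INVALID;
	if (strcmp(s, "unplugged") == 0)
		return ST_UNPLUGGED;
	if (strcmp(s, "offline") == 0)
		return ST_OFFLINE;
	if (strcmp(s, "online") == 0)
		return ST_ONLINE;
	if (strcmp(s, "online_kernel") == 0)
		return ST_ONLINE_KERNEL;
	if (strcmp(s, "online_movable") == 0)
		return ST_ONLINE_MOVABLE;
	return ST_INVALID;
}

static bool is_online(enum dax_state_model s)
{
	return s == ST_ONLINE || s == ST_ONLINE_KERNEL ||
	       s == ST_ONLINE_MOVABLE;
}

/* True only for the transitions listed as valid in the description. */
static bool transition_allowed(enum dax_state_model from,
			       enum dax_state_model to)
{
	if (from == ST_INVALID || to == ST_INVALID)
		return false;
	if (to == ST_OFFLINE)
		return false;	/* "offline" is reportable, not writable */
	if (to == ST_UNPLUGGED)
		return is_online(from) || from == ST_OFFLINE;
	/* 'to' is one of the online variants: only from "unplugged" */
	return from == ST_UNPLUGGED;
}
```

In this model, online -> online_kernel is rejected, which matches the requirement to return to "unplugged" before onlining into a different state.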
For backwards compatibility the memory blocks are always created at
probe - existing tools expect them to be present after kmem binds.
"offline" is therefore a reportable state but is not writable: it only
arises from the legacy auto_online_blocks=offline policy. Onlining
such a device through this attribute requires unplugging it first;
the intent is to get drivers that create DAX devices to set a default.
Unplug is atomic across the whole device: dax_kmem_do_hotremove()
collects every added range and offlines/removes them in one operation.
Either the operation succeeds or is entirely rolled back.
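The all-or-nothing behaviour described here can be pictured with a small userspace model. This is a sketch under assumptions, not the body of dax_kmem_do_hotremove(): struct range_model, its fields and unplug_all_or_nothing() are hypothetical, and rolling back by re-onlining the ranges already offlined is only one plausible way to get the "entirely rolled back" result.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one memory range backing a DAX kmem device. */
struct range_model {
	bool online;		/* current state of the range */
	bool can_offline;	/* false simulates a range that refuses to offline */
	bool touched;		/* set when this call offlined the range */
};

/*
 * Try to offline every range. If any range fails, re-online the ranges
 * this call already offlined so the device ends up as it started.
 * Returns 0 on success, -1 if the operation was rolled back.
 */
static int unplug_all_or_nothing(struct range_model *r, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		r[i].touched = false;

	for (i = 0; i < n; i++) {
		if (!r[i].online)
			continue;	/* already offline, nothing to do */
		if (!r[i].can_offline)
			goto rollback;
		r[i].online = false;
		r[i].touched = true;
	}
	return 0;

rollback:
	/* Walk back over the ranges before the one that failed. */
	while (i-- > 0) {
		if (r[i].touched) {
			r[i].online = true;
			r[i].touched = false;
		}
	}
	return -1;
}
```

A caller in this model never observes a half-offlined device: either every range is offline on return, or each range is back in the state it had on entry.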
Unbind Note:
We used to call remove_memory() during unbind, which would fire a
BUG() if any of the memory blocks were online at that time. We lift
this into a WARN in the cleanup routine and don't attempt hotremove
if ->state is not DAX_KMEM_UNPLUGGED or MMOP_OFFLINE.
The memory of an offline DAX device is removed on unbind as before.
If online at unbind, the resources are leaked (as before), but now
we prevent deadlock if a memory region is impossible to hotremove.
Suggested-by: Hannes Reinecke <hare@xxxxxxx>
Suggested-by: David Hildenbrand <david@xxxxxxxxxx>
Signed-off-by: Gregory Price <gourry@xxxxxxxxxx>
---
 Documentation/ABI/testing/sysfs-bus-dax |  26 +++
 drivers/base/memory.c                   |   9 +
 drivers/dax/kmem.c                      | 224 ++++++++++++++++++++----
 include/linux/memory_hotplug.h          |   1 +
 4 files changed, 224 insertions(+), 36 deletions(-)
Why do we need to treat the 'unbind' call as a given?
If we know that we cannot handle online memory during unbind,
can't we just disallow unbind in that case?
I don't think it's too much to ask of an admin to offline
the memory first, _especially_ as now we have a simple knob
to do that ...
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@xxxxxxx +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich