Re: [PATCH v7 2/3] mm/mempolicy: Prepare weighted interleave sysfs for memory hotplug

From: Dan Williams
Date: Tue Apr 08 2025 - 23:44:47 EST


Rakie Kim wrote:
> Previously, the weighted interleave sysfs structure was statically
> managed during initialization. This prevented new nodes from being
> recognized when memory hotplug events occurred, limiting the ability
> to update or extend sysfs entries dynamically at runtime.
>
> To address this, this patch refactors the sysfs infrastructure and
> encapsulates it within a new structure, `sysfs_wi_group`, which holds
> both the kobject and an array of node attribute pointers.
>
> By allocating this group structure globally, the per-node sysfs
> attributes can be managed beyond initialization time, enabling
> external modules to insert or remove node entries in response to
> events such as memory hotplug or node online/offline transitions.
>
> Instead of allocating all per-node sysfs attributes at once, the
> initialization path now uses the existing sysfs_wi_node_add() and
> sysfs_wi_node_delete() helpers. This refactoring makes it possible
> to modularly manage per-node sysfs entries and ensures the
> infrastructure is ready for runtime extension.
>
> Signed-off-by: Rakie Kim <rakie.kim@xxxxxx>
> Signed-off-by: Honggyu Kim <honggyu.kim@xxxxxx>
> Signed-off-by: Yunjeong Mun <yunjeong.mun@xxxxxx>
> Reviewed-by: Gregory Price <gourry@xxxxxxxxxx>
> ---
> mm/mempolicy.c | 61 ++++++++++++++++++++++++--------------------------
> 1 file changed, 29 insertions(+), 32 deletions(-)
>
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 0da102aa1cfc..988575f29c53 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -3419,6 +3419,13 @@ struct iw_node_attr {
> int nid;
> };
>
> +struct sysfs_wi_group {
> + struct kobject wi_kobj;
> + struct iw_node_attr *nattrs[];
> +};
> +
> +static struct sysfs_wi_group *wi_group;
> +
> static ssize_t node_show(struct kobject *kobj, struct kobj_attribute *attr,
> char *buf)
> {
> @@ -3461,27 +3468,24 @@ static ssize_t node_store(struct kobject *kobj, struct kobj_attribute *attr,
> return count;
> }
>
> -static struct iw_node_attr **node_attrs;
> -
> -static void sysfs_wi_node_release(struct iw_node_attr *node_attr,
> - struct kobject *parent)
> +static void sysfs_wi_node_delete(int nid)
> {
> - if (!node_attr)
> + if (!wi_group->nattrs[nid])
> return;
> - sysfs_remove_file(parent, &node_attr->kobj_attr.attr);
> - kfree(node_attr->kobj_attr.attr.name);
> - kfree(node_attr);
> +
> + sysfs_remove_file(&wi_group->wi_kobj,
> + &wi_group->nattrs[nid]->kobj_attr.attr);

This still looks broken to me, but I think this is more a problem that
was present in the original code.

At this point @wi_group's reference count is zero because
sysfs_wi_release() has been called. However, it can only be zero if it has
properly transitioned through kobject_del() and final kobject_put(). It
follows that kobject_del() arranges for kobj->sd to be NULL. That means
that this *should* be hitting the WARN() in kernfs_remove_by_name_ns()
for the !parent case.

So, either you are not triggering that path, or not testing that path, but
sysfs_remove_file() of the child attributes should be happening *before*
sysfs_wi_release().
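
To make the ordering concern concrete, here is a minimal userspace sketch of
the lifecycle described above. It is not kernel code: every mock_* name and
model_teardown() are hypothetical stand-ins, and the model assumes only what
is stated above, namely that kobject_del() leaves kobj->sd NULL and that
removing a file with no parent directory warns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Userspace model only. The mock_* names are hypothetical stand-ins and
 * do not reproduce the kernel's kobject or kernfs implementation.
 */
struct mock_kobj {
	int refcount;
	void *sd;	/* stands in for kobj->sd (the sysfs directory) */
	int nr_files;	/* attribute files still present in the directory */
};

static int mock_warnings;

static void mock_kobject_add(struct mock_kobj *k, int nr_files)
{
	static int dirent;	/* any non-NULL address will do */

	k->refcount = 1;
	k->sd = &dirent;
	k->nr_files = nr_files;
}

/* Models kobject_del(): the directory goes away, so sd becomes NULL. */
static void mock_kobject_del(struct mock_kobj *k)
{
	k->sd = NULL;
}

/* Models sysfs_remove_file(): warn when there is no parent directory. */
static void mock_remove_file(struct mock_kobj *k)
{
	if (!k->sd) {
		mock_warnings++;
		fprintf(stderr, "WARN: cannot remove file, no directory\n");
		return;
	}
	k->nr_files--;
}

/* Models the final kobject_put(): run the release callback at zero. */
static void mock_kobject_put(struct mock_kobj *k,
			     void (*release)(struct mock_kobj *))
{
	assert(k->refcount > 0);
	if (--k->refcount == 0)
		release(k);
}

/* Release callback that removes the attribute files too late. */
static void release_removing_files(struct mock_kobj *k)
{
	int n = k->nr_files;

	for (int i = 0; i < n; i++)
		mock_remove_file(k);
}

/* Release callback that only frees; nothing to do in this model. */
static void release_only(struct mock_kobj *k)
{
	(void)k;
}

/* Returns the number of warnings seen for the chosen teardown order. */
int model_teardown(int remove_files_first, int nr_files)
{
	struct mock_kobj k;

	mock_warnings = 0;
	mock_kobject_add(&k, nr_files);

	if (remove_files_first) {
		for (int i = 0; i < nr_files; i++)
			mock_remove_file(&k);
		mock_kobject_del(&k);
		mock_kobject_put(&k, release_only);
	} else {
		mock_kobject_del(&k);
		mock_kobject_put(&k, release_removing_files);
	}
	return mock_warnings;
}
```

In this model, model_teardown(0, 3) counts three warnings (one per
attribute), while model_teardown(1, 3) counts none, which is the ordering
being asked for.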

Did I miss something?