Re: [PATCH v7] mm: slub: move sysfs slab alloc/free interfaces to debugfs

From: Vlastimil Babka
Date: Wed May 26 2021 - 07:40:40 EST


On 5/25/21 9:38 AM, Faiyaz Mohammed wrote:
> alloc_calls and free_calls implementation in sysfs have two issues,
> one is PAGE_SIZE limitiation of sysfs and other is it does not adhere
> to "one value per file" rule.
>
> To overcome this issues, move the alloc_calls and free_calls implemeation
> to debugfs.
>
> Rename the alloc_calls/free_calls to alloc_traces/free_traces,
> to be inline with what it does.
>
> Reported-by: kernel test robot <lkp@xxxxxxxxx>
> Reported-by: Dan Carpenter <dan.carpenter@xxxxxxxxxx>

These were IIRC bot reports for some bugs in the previous versions, so keeping
the Reported-by: for the whole patch is misleading - these were not reports for
the sysfs issues this patch fixes by moving the files to debugfs.

> Signed-off-by: Faiyaz Mohammed <faiyazm@xxxxxxxxxxxxxx>
> ---
> changes in V7:
> - Drop the older alloc_calls and free_calls interface.
> changes in v6:
> - https://lore.kernel.org/linux-mm/1621341949-26762-1-git-send-email-faiyazm@xxxxxxxxxxxxxx/
>
> changes in v5:
> - https://lore.kernel.org/linux-mm/1620296523-21922-1-git-send-email-faiyazm@xxxxxxxxxxxxxx/
>
> changes in v4:
> - https://lore.kernel.org/linux-mm/1618583239-18124-1-git-send-email-faiyazm@xxxxxxxxxxxxxx/
>
> changes in v3:
> - https://lore.kernel.org/linux-mm/1617712064-12264-1-git-send-email-faiyazm@xxxxxxxxxxxxxx/
>
> changes in v2:
> - https://lore.kernel.org/linux-mm/3ac1d3e6-6207-96ad-16a1-0f5139d8b2b5@xxxxxxxxxxxxxx/
>
> changes in v1:
> - https://lore.kernel.org/linux-mm/1610443287-23933-1-git-send-email-faiyazm@xxxxxxxxxxxxxx/
>
> include/linux/slub_def.h | 8 ++
> mm/slab_common.c | 9 ++
> mm/slub.c | 353 ++++++++++++++++++++++++++++++++++-------------
> 3 files changed, 276 insertions(+), 94 deletions(-)

I don't see any of the symlinks under /sys/kernel/debug/slab/, so I think the
aliases handling code is wrong, and I can see at least two reasons why it could be:

> @@ -4525,6 +4535,8 @@ __kmem_cache_alias(const char *name, unsigned int size, unsigned int align,
> s->refcount--;
> s = NULL;
> }
> +
> + debugfs_slab_alias(s, name);

Here you might be calling debugfs_slab_alias() with NULL if the
sysfs_slab_alias() above returned true.

> }
>
> return s;

...

> +static int __init slab_debugfs_init(void)
> +{
> + struct kmem_cache *s;
> +
> + slab_debugfs_root = debugfs_create_dir("slab", NULL);
> +
> + slab_state = FULL;
> +
> + list_for_each_entry(s, &slab_caches, list)
> + debugfs_slab_add(s);
> +
> + while (alias_list) {
> + struct saved_alias *al = alias_list;

alias_list a single list and both slab_sysfs_init() and slab_debugfs_init()
flush it. So only the init call that happens to be called first, does actually
find an unflushed list. I think you
need to use a separate list for debugfs (simpler) or a shared list with both
sysfs and debugfs processing (probably more complicated).

And finally a question, perhaps also for Greg. With sysfs, we hand out the
lifecycle of struct kmem_cache to sysfs, to ensure we are not reading sysfs
files of a cache that has been removed.

But with debugfs, what are the guarantees that things won't blow up when a
debugfs file is being read while somebody calls kmem_cache_destroy() on the cache?

> +
> + alias_list = alias_list->next;
> +
> + debugfs_slab_alias(al->s, al->name);
> +
> + kfree(al);
> + }
> +
> + return 0;
> +
> +}
> +__initcall(slab_debugfs_init);
> +#endif
> /*
> * The /proc/slabinfo ABI
> */
>