[PATCH v5 bpf-next 0/3] bpf: Add kmem_cache iterator and kfunc

From: Namhyung Kim
Date: Thu Oct 10 2024 - 19:25:18 EST


Hello,

I'm proposing a new iterator and a kfunc for the slab memory allocator
to get information about each kmem_cache, as in /proc/slabinfo or
/sys/kernel/slab, but in a more flexible way.

v5 changes)

* set PTR_UNTRUSTED for return value of bpf_get_kmem_cache() (Alexei)
* add KF_RCU_PROTECTED to bpf_get_kmem_cache(). See below. (Song)
* add WARN_ON_ONCE and comment in kmem_cache_iter_seq_next() (Song)
* change kmem_cache_iter_seq functions not to call BPF on intermediate stop
* add a subtest to compare the kmem cache info with /proc/slabinfo (Alexei)

v4: https://lore.kernel.org/lkml/20241002180956.1781008-1-namhyung@xxxxxxxxxx

* skip kmem_cache_destroy() in kmem_cache_iter_seq_stop() if possible (Vlastimil)
* fix a bug in the kmem_cache_iter_seq_start() for the last entry

v3: https://lore.kernel.org/lkml/20241002065456.1580143-1-namhyung@xxxxxxxxxx/

* rework kmem_cache_iter not to hold slab_mutex when running BPF (Alexei)
* add virt_addr_valid() check (Alexei)
* fix random test failure by running test with the current task (Hyeonggon)

v2: https://lore.kernel.org/lkml/20240927184133.968283-1-namhyung@xxxxxxxxxx/

* rename it to "kmem_cache_iter"
* fix a build issue
* add Acked-by's from Roman and Vlastimil (Thanks!)
* add error codes in the test for debugging

v1: https://lore.kernel.org/lkml/20240925223023.735947-1-namhyung@xxxxxxxxxx/

My use case is the `perf lock contention` tool, which shows contended
locks. Many of them are not global locks and don't have symbols. If the
tool can translate the address of a lock in a slab object to the name of
the slab, it'd be much more useful.

I'm not aware of any type information in slab yet, but I was told there
is work in progress to associate a BTF ID with it. That would definitely
help my use case. We would probably need another kfunc to get the start
address of the object, or the offset within the object from an address,
if the type info is available. But I want to start with a simple thing
first.

The kmem_cache_iter iterates kmem_cache objects under slab_mutex and
will be useful for userspace to prepare some work for specific slabs
like setting up filters in advance. And the bpf_get_kmem_cache()
kfunc will return a pointer to a slab from the address of a lock.
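
For illustration only, here is a toy userspace model in plain C of how
the two pieces could fit together. It is not BPF and not the code in
this series. It assumes a table filled in advance (standing in for what
the iterator would report) and a range lookup (standing in for
bpf_get_kmem_cache()). The struct, the function name and the address
ranges are all made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: one entry per slab cache, filled in advance. */
struct toy_cache {
	const char *name;	/* cache name, e.g. "task_struct" */
	uintptr_t start;	/* first address covered (made up) */
	uintptr_t end;		/* one past the last address covered */
};

/*
 * Stand-in for the kfunc: return the entry whose range holds addr,
 * or NULL when the address does not belong to any known cache.
 */
static const struct toy_cache *toy_lookup(const struct toy_cache *tbl,
					  size_t n, uintptr_t addr)
{
	assert(tbl != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (addr >= tbl[i].start && addr < tbl[i].end)
			return &tbl[i];
	}
	return NULL;
}
```

A caller would build the table once, then call toy_lookup() for every
contended lock address and print the name of the returned entry, or
fall back to the raw address when the lookup returns NULL.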

Actually I'm not sure about the RCU lock. IIUC it doesn't protect the
kmem_cache itself, but kmem_cache_destroy() calls some RCU barrier
functions, so holding the RCU read lock would keep the object from being
freed by kfree_rcu() or similar, and in turn keep the kmem_cache alive.
But please correct me if I'm wrong.

The test code reads from the iterator and makes sure it finds the slab
cache of the task_struct for the current task.
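
As a rough sketch of the userspace side of such a check, the C helper
below parses one data line in the /proc/slabinfo layout (name,
active_objs, num_objs, objsize, ...) so that the values can be compared
with whatever the iterator program printed. The struct and function
names are hypothetical, and this is not the selftest code from the
series; which fields the subtest actually compares is not stated here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Fields taken from one /proc/slabinfo data line (hypothetical struct). */
struct slab_line {
	char name[64];
	unsigned long active_objs;
	unsigned long num_objs;
	unsigned int objsize;
};

/* Returns 0 on success, -1 for header or malformed lines. */
static int parse_slabinfo_line(const char *line, struct slab_line *out)
{
	assert(line != NULL && out != NULL);

	/* The two header lines start with "slabinfo -" and '#'. */
	if (line[0] == '#' || strncmp(line, "slabinfo -", 10) == 0)
		return -1;

	/* Only the first four columns are needed for this sketch. */
	if (sscanf(line, "%63s %lu %lu %u", out->name,
		   &out->active_objs, &out->num_objs, &out->objsize) != 4)
		return -1;

	return 0;
}
```

A caller would read /proc/slabinfo line by line (which normally needs
root), skip the lines where this returns -1, and compare name and
objsize against the matching line of the iterator output.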

The code is available on the 'bpf/slab-iter-v5' branch at
https://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Thanks,
Namhyung


Namhyung Kim (3):
  bpf: Add kmem_cache iterator
  mm/bpf: Add bpf_get_kmem_cache() kfunc
  selftests/bpf: Add a test for kmem_cache_iter

 include/linux/btf_ids.h                      |   1 +
 kernel/bpf/Makefile                          |   1 +
 kernel/bpf/helpers.c                         |   1 +
 kernel/bpf/kmem_cache_iter.c                 | 175 ++++++++++++++++++
 kernel/bpf/verifier.c                        |   5 +
 mm/slab_common.c                             |  19 ++
 .../bpf/prog_tests/kmem_cache_iter.c         | 115 ++++++++++++
 tools/testing/selftests/bpf/progs/bpf_iter.h |   7 +
 .../selftests/bpf/progs/kmem_cache_iter.c    |  95 ++++++++++
 9 files changed, 419 insertions(+)
 create mode 100644 kernel/bpf/kmem_cache_iter.c
 create mode 100644 tools/testing/selftests/bpf/prog_tests/kmem_cache_iter.c
 create mode 100644 tools/testing/selftests/bpf/progs/kmem_cache_iter.c


base-commit: 5bd48a3a14df4b3ee1be0757efcc0f40d4f57b35
--
2.47.0.rc1.288.g06298d1525-goog