Re: [PATCH v2 0/6] alloc_tag: introduce IOCTL-based filtering for MAP

From: Suren Baghdasaryan

Date: Wed Jun 03 2026 - 15:54:14 EST


On Mon, May 25, 2026 at 12:33 AM Hao Ge <hao.ge@xxxxxxxxx> wrote:
>
> Hi Andrew and Suren
>
>
> On 2026/5/23 04:11, Andrew Morton wrote:
> > On Fri, 22 May 2026 17:45:32 +0000 Abhishek Bapat <abhishekbapat@xxxxxxxxxx> wrote:
> >
> >> Currently, memory allocation profiling data is primarily exposed through
> >> /proc/allocinfo. While useful for manual inspection, this text-based
> >> interface poses challenges for production monitoring and large-scale
> >> analysis:
> >>
> >> 1. Userspace must parse large amounts of text to extract specific
> >> fields.
> >> 2. To find specific tags, userspace must read the entire dataset,
> >> requiring many context switches and copying large amounts of data.
> >> 3. The kernel currently aggregates per-CPU counters for every allocation
> >> size, even those the user intends to filter out immediately.
> >>
> >> This series introduces a new IOCTL-based binary interface for allocinfo
> >> that supports kernel-side filtering. By allowing the user to specify a
> >> filter mask, we significantly reduce the work performed in-kernel and
> >> the amount of data transferred to userspace.
> >>
> >> Performance measurements were conducted on an Intel Xeon Platinum 8481C
> >> (224 CPUs) with caches dropped before each run.
> >>
> >> The IOCTL mechanism shows a ~20x performance improvement for
> >> filtered queries. The kernel avoids the expensive per-CPU counter
> >> aggregation (alloc_tag_read) for any tags that fail the initial string
> >> or location filters.
> >>
> >> Scenario 1: Specific File Filtering (arch/x86/events/rapl.c)
> >> 1. Traditional (cat /proc/allocinfo | grep): 22ms (sys)
> >> 2. IOCTL Interface: 1ms (sys)
> >>
> >> Scenario 2: Compound Filtering (Filename + Size)
> >> 1. Traditional (cat ... | grep | awk): 21ms (sys)
> >> 2. IOCTL Interface: 1ms (sys)
> >>
> >> Scenario 3: Size-Based Filtering (min_size = 1MB)
> >> 1. Traditional (cat ... | awk): 21ms (sys)
> >> 2. IOCTL Interface: 14ms (sys)
> > Yup, textual interfaces aren't fast.
> >
> > And ioctl-based interfaces aren't popular. One would prefer to see an
> > interface which uses read()/lseek(), pread(), etc. It would be
> > appropriate for this [0/N] to have a discussion of why that approach
> > was not chosen.
> >
> >>  .../userspace-api/ioctl/ioctl-number.rst      |   2 +
> >>  MAINTAINERS                                   |   2 +
> >>  include/linux/codetag.h                       |   1 +
> >>  include/uapi/linux/alloc_tag.h                |  87 +++
> >>  lib/alloc_tag.c                               | 303 ++++++++++-
> >>  lib/codetag.c                                 |  11 +
> >>  tools/testing/selftests/alloc_tag/Makefile    |   9 +
> >>  .../alloc_tag/allocinfo_ioctl_test.c          | 505 ++++++++++++++++++
> >>  8 files changed, 918 insertions(+), 2 deletions(-)
> >> create mode 100644 include/uapi/linux/alloc_tag.h
> >> create mode 100644 tools/testing/selftests/alloc_tag/Makefile
> >> create mode 100644 tools/testing/selftests/alloc_tag/allocinfo_ioctl_test.c
> > At some point this should grow user-facing documentation, please.
> >
> > And the right time for that is now, because such documentation is
> > useful for code review - it makes that review both easier and more
> > useful.
> >
> > Sashiko had a few things to say:
> >
> > https://sashiko.dev/#/patchset/cover.1779471082.git.abhishekbapat@xxxxxxxxxx
>
> I notice that Sashiko has reported a pre-existing issue, as described below:
>
>
> > static void *allocinfo_start(struct seq_file *m, loff_t *pos)
> This is a pre-existing issue, but can resuming a sequential read on
> /proc/allocinfo cause a use-after-free if a kernel module is unloaded
> between read() system calls?
> The seq_file read operation updates priv->iter.ct during allocinfo_next(),
> stops iteration, and returns to userspace. If the module containing
> priv->iter.ct is unloaded while the lock is dropped, the module's codetag
> memory is freed.
> On the next read() system call, allocinfo_start() with pos > 0 reacquires
> the lock but returns priv without validating if priv->iter.ct still belongs
> to a valid module. Does allocinfo_show() then dereference this dangling
> pointer?
> [ ... ]
>
> This issue is unrelated to the current patch series and can be resolved
> by reverting commit 9f44df50fee4.
>
> Therefore, I have submitted a separate patch addressing this issue,
> which is available at the link below:
>
> https://lore.kernel.org/all/20260525072117.112779-1-hao.ge@xxxxxxxxx/

Thanks, Hao! I commented on your patch; please take a look. I think
there is a better fix.
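
For anyone following along, here is a small userspace model of the
hazard described in the quoted report: a cursor saved between read()
calls can go stale if its module goes away in the meantime. This is
only a sketch. Every name in it (module_tags, tag_iter, model_load,
model_unload, iter_save, iter_resume) is hypothetical, it is not the
code in lib/alloc_tag.c, and it is not the fix proposed in either
patch. It shows one generic way to revalidate a saved cursor by
comparing a per-module generation counter before dereferencing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; not the kernel's data structures. */
#define MAX_MODULES 4
#define MAX_TAGS    8

struct module_tags {
	bool loaded;          /* false once the "module" is unloaded */
	unsigned long gen;    /* bumped on every load and unload */
	int ntags;
	int tags[MAX_TAGS];   /* stand-in for the module's codetag entries */
};

struct tag_iter {
	int mod_idx;          /* module the cursor points into */
	int tag_idx;          /* tag within that module */
	unsigned long gen;    /* module generation seen when the cursor was saved */
};

static struct module_tags modules[MAX_MODULES];

static void model_load(int idx, int ntags)
{
	assert(idx >= 0 && idx < MAX_MODULES);
	assert(ntags >= 0 && ntags <= MAX_TAGS);
	modules[idx].loaded = true;
	modules[idx].gen++;
	modules[idx].ntags = ntags;
	for (int i = 0; i < ntags; i++)
		modules[idx].tags[i] = idx * 100 + i;
}

static void model_unload(int idx)
{
	assert(idx >= 0 && idx < MAX_MODULES);
	modules[idx].loaded = false;
	modules[idx].gen++;
}

/* Record the position together with the generation it was valid for. */
static void iter_save(struct tag_iter *it, int mod_idx, int tag_idx)
{
	assert(mod_idx >= 0 && mod_idx < MAX_MODULES);
	it->mod_idx = mod_idx;
	it->tag_idx = tag_idx;
	it->gen = modules[mod_idx].gen;
}

/*
 * Resume iteration. The saved position is trusted only if the module is
 * still loaded and its generation is unchanged; otherwise the cursor
 * skips ahead to the first tag of the next loaded module.
 * Returns NULL when nothing is left to visit.
 */
static const int *iter_resume(struct tag_iter *it)
{
	while (it->mod_idx < MAX_MODULES) {
		struct module_tags *m = &modules[it->mod_idx];

		if (m->loaded && m->gen == it->gen && it->tag_idx < m->ntags)
			return &m->tags[it->tag_idx];

		/* Stale or exhausted: restart at the next module. */
		it->mod_idx++;
		it->tag_idx = 0;
		if (it->mod_idx < MAX_MODULES)
			it->gen = modules[it->mod_idx].gen;
	}
	return NULL;
}
```

In this model a cursor saved inside module 0 is skipped forward to
module 1 if module 0 is unloaded before the next resume, so the stale
position is never dereferenced. Whether a generation check, holding a
module reference, or something else is appropriate for the real code
is a separate question for the other thread.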

>
> Thanks
>
> Best Regards
>
> Hao
>