Re: [PATCH v3 0/6] alloc_tag: introduce IOCTL-based filtering for MAP

From: Abhishek Bapat

Date: Tue Jun 09 2026 - 16:48:01 EST


On Mon, Jun 8, 2026 at 5:29 PM Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
>
> On Mon, Jun 8, 2026 at 5:02 PM Abhishek Bapat <abhishekbapat@xxxxxxxxxx> wrote:
> >
> > On Fri, Jun 5, 2026 at 5:09 PM Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > On Fri, 5 Jun 2026 23:36:45 +0000 Abhishek Bapat <abhishekbapat@xxxxxxxxxx> wrote:
> > >
> > > > Currently, memory allocation profiling data is primarily exposed through
> > > > /proc/allocinfo. While useful for manual inspection, this text-based
> > > > interface poses challenges for production monitoring and large-scale
> > > > analysis:
> > > >
> > > > 1. Userspace must parse large amounts of text to extract specific
> > > > fields.
> > > > 2. To find specific tags, userspace must read the entire dataset,
> > > > requiring many context switches and high data copying.
> > > > 3. The kernel currently aggregates per-CPU counters for every allocation
> > > > size, even those the user intends to filter out immediately.
> > > >
> > > > This series introduces a new IOCTL-based binary interface for allocinfo
> > > > that supports kernel-side filtering. By allowing the user to specify a
> > > > filter mask, we significantly reduce the work performed in-kernel and
> > > > the amount of data transferred to userspace.
> > >
> > > Thanks. AI review found several things - you'll want to address at
> > > least the first few.
> > >
> > > https://sashiko.dev/#/patchset/cover.1780701922.git.abhishekbapat@xxxxxxxxxx
> >
> > All, please note I missed attaching the reason for choosing the IOCTL
> > mechanism to this cover letter, but I will attach it to the v4
> > patchset cover letter along with other changes. Thanks!
>
> Can you please add it here now so that we can review that?

I intend to add the following at the end of the cover letter, right
before the per-version changelog:

The ioctl() mechanism was chosen for allocinfo to address the
performance bottleneck of per-CPU counter aggregation. A traditional
read() operation must report the allocation count and total size for
every code tag in the system. Doing so requires iterating across all
CPUs to sum the per-CPU counters of thousands of tags, which introduces
substantial runtime overhead.

The ioctl() interface allows userspace to push selective filtering
criteria into the kernel, where they are applied before the per-CPU
counter aggregation. The kernel then aggregates per-CPU counters only
for the small subset of tags that match the filter, so the aggregation
cost scales with the number of matching tags rather than with the total
number of tags in the system.

Beyond fast filtered retrieval, the ioctl() interface also provides a
foundation for a future mechanism that captures the context of specific
allocations.
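
To make the filter-before-aggregate ordering concrete, here is a minimal
userspace sketch. It is not code from the series: every name in it
(struct code_tag, struct tag_filter, report_filtered(), the filename-only
filter, and the cpus_visited counter) is a hypothetical stand-in chosen
for illustration, and plain arrays model the per-CPU counters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace model; none of these names come from the series. */
#define NR_CPUS 4
#define NR_TAGS 3

struct tag_counter {
	uint64_t bytes;
	uint64_t calls;
};

struct code_tag {
	const char *filename;
	struct tag_counter percpu[NR_CPUS];	/* stand-in for per-CPU data */
};

/* Assumed filter shape: match on filename only; NULL matches every tag. */
struct tag_filter {
	const char *filename;
};

/* Counts per-CPU slots read, to make the aggregation cost visible. */
static size_t cpus_visited;

/* The expensive step: walk every CPU's counters for one tag. */
static struct tag_counter sum_percpu(const struct code_tag *tag)
{
	struct tag_counter total = { 0, 0 };

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		total.bytes += tag->percpu[cpu].bytes;
		total.calls += tag->percpu[cpu].calls;
		cpus_visited++;
	}
	return total;
}

static int tag_matches(const struct code_tag *tag, const struct tag_filter *f)
{
	return !f->filename || strcmp(tag->filename, f->filename) == 0;
}

/*
 * Apply the filter first and aggregate only the tags that pass it.
 * Returns the number of entries written to out.
 */
static size_t report_filtered(const struct code_tag *tags, size_t nr_tags,
			      const struct tag_filter *f,
			      struct tag_counter *out, size_t out_len)
{
	size_t n = 0;

	assert(tags && f && out);
	for (size_t i = 0; i < nr_tags && n < out_len; i++) {
		if (!tag_matches(&tags[i], f))
			continue;	/* skipped tags cost no per-CPU reads */
		out[n++] = sum_percpu(&tags[i]);
	}
	return n;
}
```

In this model a filter that matches one tag reads NR_CPUS counter slots,
while a NULL filter (the read()-style full report) reads
NR_CPUS * NR_TAGS of them. The real interface would of course operate on
the kernel's per-CPU data and its own filter structure.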