Re: [PATCH 0/2] tracing: Have trace_pid_list be a sparse array

From: Masami Hiramatsu
Date: Fri Sep 24 2021 - 09:36:13 EST


On Fri, 24 Sep 2021 09:16:27 -0400
Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:

> On Fri, 24 Sep 2021 12:51:07 +0200
> Eugene Syromyatnikov <evgsyr@xxxxxxxxx> wrote:
>
> Hi Eugene,
>
> > Note that there is only one top-level chunk (so its size doesn't
> > matter much), (usually) about one middle-tier chunk (except for some
> > heavy cases, since pids are allocated linearly), and quite some
> > lower-tier bitset leaves. So I'd optimise towards smaller leaves at
> > the expense of middle-tier (and especially top-tier) chunk size
> > (especially considering the fact that in the kernel, buddy allocator
> > is used), like 12-8-12 or something like that, but I have no factual
>
> What I really like about my 8 8 14 split I have, it makes everything 2K in
> size on 64 bit machines (1K 1K 2K for 32 bit, but who cares ;-)
>
> 1 << 8 * 8 bytes = 2K // top tiers are pointers to lower tiers
> 1 << 14 bits = 2K // lower tier only cares about bits
>
> This means they will likely all be allocated in the same slab.
>
> I'm optimizing the top tiers for size, because they are likely to be empty.
> Why add memory for something that will never be used, and can't be removed.
> Note, the middle and lower tiers can be reused when they go empty, which is
> a likely use case (at least when I test this using hackbench).
>
> > basis for arguing about specific split. Also, I cannot resist from
> > noticing that this reminds me an awful lot of XArray and [1]. Maybe,
> > some wrapper around XArray would do?
> >
> > [1] https://gitlab.com/strace/strace/-/raw/master/src/trie.h
> >
>
> I looked into xarray and it appears to be optimized for storing something,
> where as I'm just interested in a sparse bitmask.

I guess he suggested that store the bitmask in xarray. Anyway, both are
OK to me. This is needed for reducing the memory.

Thank you,

>
> Thanks for the review.
>
> Note, I'll be posting a v3 soon because I found if I echo 1<<30 into
> set_event_pid, it adds 0 (because it only looks at the bottom 30 bits).
> It should really return -EINVAL.
>
> -- Steve


--
Masami Hiramatsu <mhiramat@xxxxxxxxxx>