Re: [PATCH 1/2] perf: Add closing sibling events' file descriptors

From: Andi Kleen
Date: Tue Aug 11 2020 - 18:15:58 EST


On Tue, Aug 11, 2020 at 07:21:13PM +0300, Alexander Shishkin wrote:
> Andi Kleen <ak@xxxxxxxxxxxxxxx> writes:
>
> > On Tue, Aug 11, 2020 at 12:47:24PM +0300, Alexander Shishkin wrote:
> >>
> >> Right, but which bytes? One byte per event? That's
> >> arbitrary. sizeof(struct perf_event)? Then, probably also sizeof(struct
> >> perf_event_context).
> >
> > Yes the sum of all the sizeofs needed for a perf_event.
>
> Well, *all* of them will be tedious to collect, seeing as there is
> ctx->task_ctx_data, there is ring_buffer, scheduling trees, there is
> stuff that pmus allocate under the hood, like AUX SG tables.

Well, I'm sure we can figure something out. I guess it doesn't need to be
fully accurate, just approximate enough, and bounded.

>
> >> The above two structs add up to 2288 bytes on my local build. Given the
> >> default RLIMIT_MEMLOCK of 64k, that's 28 events. As opposed to ~1k
> >> events if we keep using the RLIMIT_NOFILE. Unless I'm missing your
> >> point.
> >
> > Yes that's true. We would probably need to increase the limit to a few
> > MB at least.
>
> Ok, but if we have to increase a limit anyway, we might as well increase
> the NOFILE.

NOFILE is a terrible limit because, for DoS purposes, the real exposure
is some large factor * NOFILE. Also I suspect there will be many cases
where the kernel default is not used.

But yes, I suspect it should be increased, not just for perf, but
for other use cases. AFAIK pretty much every non-trivial network
server has to change it.

>
> > Or maybe use some combination with the old rlimit for compatibility.
> > The old rlimit would give an implicit extra RLIMIT_NOFILE * 2288 limit
> > for RLIMIT_MEMLOCK. This would only give full compatibility for a single
> > perf process, but I suspect that's good enough for most users.
>
> We'd need to settle on charging a fixed set of structures per event,
> then. And, without increasing the file limit, this would still total at
> 1052 events.

True. For perf we really would like a limit that scales with the number
of CPUs.
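
To make the numbers in this thread easy to check, here is a rough sketch of
the arithmetic. It is not kernel code. The 2288-byte per-event charge and the
64k RLIMIT_MEMLOCK default come from the messages above; the RLIMIT_NOFILE
value of 1024 and the events-per-CPU figure are assumptions for illustration,
and the function names are made up.

```python
# Per-event charge quoted in the thread: sizeof(struct perf_event) +
# sizeof(struct perf_event_context) on one particular build.
PER_EVENT_BYTES = 2288


def events_under_memlock(memlock_bytes, per_event_bytes=PER_EVENT_BYTES):
    """Events that fit if each one is charged against RLIMIT_MEMLOCK."""
    return memlock_bytes // per_event_bytes


def events_with_compat(memlock_bytes, nofile, per_event_bytes=PER_EVENT_BYTES):
    """Same, plus an implicit extra allowance of nofile * per_event_bytes."""
    budget = memlock_bytes + nofile * per_event_bytes
    return budget // per_event_bytes


def cpus_covered(event_budget, events_per_cpu):
    """CPUs that a fixed event budget covers at a given events-per-CPU rate."""
    return event_budget // events_per_cpu


if __name__ == "__main__":
    memlock = 64 * 1024  # default RLIMIT_MEMLOCK mentioned in the thread
    nofile = 1024        # assumed RLIMIT_NOFILE soft limit
    print(events_under_memlock(memlock))        # 28
    total = events_with_compat(memlock, nofile)
    print(total)                                # 1052
    print(cpus_covered(total, 8))               # 131, assuming 8 events per CPU
```

The last line is the point about scaling: a fixed budget covers fewer CPUs as
the per-CPU event count grows, no matter which rlimit it is charged to.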

>
> We could also involve perf_event_mlock_kb *and* increase it too, but I
> suspect distros don't just leave it at kernel's default either.

I haven't seen any distribution that changed it so far.

-Andi