Re: [PATCH v3 00/35] Memory allocation profiling
From: Kent Overstreet
Date: Wed Feb 14 2024 - 15:00:42 EST
On Wed, Feb 14, 2024 at 11:24:23AM -0800, Suren Baghdasaryan wrote:
> On Wed, Feb 14, 2024 at 9:52 AM Kent Overstreet
> <kent.overstreet@xxxxxxxxx> wrote:
> >
> > On Wed, Feb 14, 2024 at 08:55:48AM -0800, Andrew Morton wrote:
> > > On Tue, 13 Feb 2024 14:59:11 -0800 Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
> > >
> > > > > > If you think you can easily achieve what Michal requested without all that,
> > > > > > good.
> > > > >
> > > > > He requested something?
> > > >
> > > > Yes, a cleaner instrumentation. Unfortunately the cleanest one is not
> > > > possible until the compiler feature is developed and deployed. And it
> > > > still would require changes to the headers, so don't think it's worth
> > > > delaying the feature for years.
> > >
> > > Can we please be told much more about this compiler feature?
> > > Description of what it is, what it does, how it will affect this kernel
> > > feature, etc.
> > >
> > > Who is developing it and when can we expect it to become available?
> > >
> > > Will we be able to migrate to it without back-compatibility concerns?
> > > (I think "you need quite recent gcc for memory profiling" is
> > > reasonable).
> > >
> > >
> > >
> > > Because: if the maintainability issues which Michel describes will be
> > > significantly addressed with the gcc support then we're kinda reviewing
> > > the wrong patchset. Yes, it may be a maintenance burden initially, but
> > > at some (yet to be revealed) time in the future, this will be addressed
> > > with the gcc support?
> >
> > Even if we had compiler magic, after considering it more I don't think
> > the patchset would be improved by it - I would still prefer to stick
> > with the macro approach.
> >
> > There's also a lot of unresolved questions about whether the compiler
> > approach would even end being what we need; we need macro expansion to
> > happen in the caller of the allocation function
>
> For the record, that's what this attribute will be doing. So it should
> cover our usecase.
That wasn't clear in the meeting we had the other day; all that was
discussed there was the attribute syntax, as I recall.
So say that does work out (and I don't think that's a given; if I were a
compiler person I don't think I'd be interested in this strange half
macro, half inline function beast); all that has accomplished is to get
rid of the need for the renaming - the _noprof() versions of functions.
So then how do you distinguish where in the callstack the accounting
happens?
If you say "it happens at the outermost wrapper", then what happens is
- Extra overhead for all the inner wrapper invocations, where they have
to now check "actually, we already have an alloc tag, don't do
anything". That's a cost, and given how much time we spent shaving
cycles and branches during development it's not one we want.
- Inner allocations that shouldn't be accounted to the outer context
are now a major problem, because they silently will be accounted
there and never noticed.
With our approach, inner allocations are by default (i.e. when we
haven't switched them to the _noprof() variant) accounted to their
own alloc tag; that way, when we're reading the /proc/allocinfo
output, we can examine them and check if they should be collapsed to
the outer context. With this approach they won't be seen.
So no, we still don't want the compiler approach.