Re: [PATCH 1/8] perf: Allow to block process in syscall tracepoints

From: Peter Zijlstra
Date: Mon Dec 10 2018 - 05:18:43 EST


On Sat, Dec 08, 2018 at 12:38:05PM -0500, Steven Rostedt wrote:
> On Sat, 8 Dec 2018 11:44:23 +0100
> Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

> > > Why do we care about lost events? Because strace records *all* events,
> > > as that's what it does and that's what it always has done. It would be
> > > a break in functionality (a regression) if it were to start losing
> > > events. I use strace to see everything that an application is doing.
> >
> > So make a new tool; break the expectation of all events. See if there's
> > anybody that really cares.
>
> Basically you are saying, break strace and see if anyone notices?

Nah, give it a new name. Clearly mark this as a new tool.

> > > When we discussed this at plumbers, Oracle people came to me and said
> > > how awesome it would be to run strace against their database accesses.
> > > The problem today is that strace causes such a large overhead that it
> > > isn't feasible to trace any high speed applications, especially if
> > > there are time constraints involved.
> >
> > So have them run that perf thing acme pointed to.
> >
> > So far nobody's made a good argument for why we cannot have LOST events.
>
> If you don't see the use case, I'm not sure anyone can convince you.
> Again, I like the fact that when I do a strace of an application I know
> that all system calls made by the application I'm tracing are recorded. I
> don't need to worry about what happened in the "lost events" space.

You're the one pushing for this crap without _any_ justification. Why
are you getting upset if I ask for some?

If people care so much, it shouldn't be hard to write up a coherent
story on this. So far all I seem to get is: because it's always been
like that.

Which really isn't much of an argument.