Re: [PATCH v2 0/6] perf: Introduce extended syscall error reporting

From: Ingo Molnar
Date: Fri Aug 28 2015 - 06:08:14 EST

* Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:

> On Wed, 26 Aug 2015 22:05:13 +0200 Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > > Is this whole thing overkill? As far as I can see, the problem which is
> > > being addressed only occurs in a couple of places (perf, wifi netlink
> > > handling) and could be addressed with some local pr_debug statements. ie,
> > >
> > > #define err_str(e, s) ({						\
> > > 	if (debugging)							\
> > > 		pr_debug("%s:%d: error %d (%s)", __FILE__, __LINE__, e, s); \
> > > 	e;								\
> > > })
> > >
> > > (And I suppose that if this is later deemed inadequate, err_str() could
> > > be made more fancy).
> >
> > Not really. That is something that's limited to root. Whereas the
> > problem is very much wider than that.
> >
> > If you set one bit wrong in the pretty large perf_event_attr you've got
> > a fair chance of getting -EINVAL on trying to create the event. Good
> > luck finding what you did wrong.
> >
> > Any user can create events (for their own tasks), this does not require
> > root.
> >
> > Allowing users to flip your @debugging flag would be an insta DoS.
> >
> > Furthermore, it's very unfriendly in that you have to (manually) go
> > correlate random dmesg output with some program action.
>
> It depends on who the audience is. If it's developers who are writing userspace
> perf tooling then all the above won't be an issue. If it's aimed at end users
> of that tooling then yes.
>
> IOW, we're in the usual situation of discussing implementation before anyone has
> explained the requirements.

So the perf background was well understood by most people involved, it just didn't
survive into the 0/N description:

The problem is that we have a complex attribute structure with dozens of user
triggerable (and often hardware dependent) failure scenarios all returning one of
-EINVAL or -ENOTSUPP. Likewise there's a similarly complex scheduler attribute
structure handled by SyS_sched_setattr() with 10+ failure modes.

So since the kernel actually knows exactly what the failure was, and we lose that
information due to errno clustering, we thought it a brilliant idea to try to be
helpful to human users of the tooling and to attempt to preserve this
information - to make Linux tooling a bit less passive-aggressive than it is
today. (Or at least those parts of tooling that we are writing!)
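
To make the errno clustering concrete, here is a minimal userspace sketch. It is
purely illustrative and is not the interface proposed in the patch series: the
struct, the reason codes and every demo_* name are hypothetical. It only shows
three unrelated checks collapsing into the same -EINVAL, and how a side channel
could carry the cause that the validating code already knows:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for a large attribute struct. */
struct demo_attr {
	unsigned int type;          /* valid values: 0..DEMO_TYPE_MAX */
	unsigned int sampling;      /* non-zero if sampling is requested */
	unsigned int sample_period; /* must be non-zero when sampling */
	unsigned int flags;         /* only bits in DEMO_VALID_FLAGS allowed */
};

#define DEMO_TYPE_MAX    2u
#define DEMO_VALID_FLAGS 0x7u

/* Hypothetical reason codes: one per distinct failure site. */
enum demo_reason {
	DEMO_OK = 0,
	DEMO_BAD_TYPE,
	DEMO_BAD_FLAGS,
	DEMO_ZERO_PERIOD,
	DEMO_REASON_COUNT
};

static const char *const demo_reason_text[DEMO_REASON_COUNT] = {
	[DEMO_OK]          = "no error",
	[DEMO_BAD_TYPE]    = "unknown event type",
	[DEMO_BAD_FLAGS]   = "reserved flag bits set",
	[DEMO_ZERO_PERIOD] = "sampling requested with a zero period",
};

/* Record the reason if the caller asked for it; the errno is always -EINVAL. */
static int demo_fail(enum demo_reason *why, enum demo_reason reason)
{
	if (why)
		*why = reason;
	return -EINVAL;
}

/*
 * Validate @attr. Every failure returns the same -EINVAL; the distinct cause
 * survives only through @why. Passing why == NULL models the plain-errno case.
 */
static int demo_validate(const struct demo_attr *attr, enum demo_reason *why)
{
	assert(attr != NULL);

	if (why)
		*why = DEMO_OK;
	if (attr->type > DEMO_TYPE_MAX)
		return demo_fail(why, DEMO_BAD_TYPE);
	if (attr->flags & ~DEMO_VALID_FLAGS)
		return demo_fail(why, DEMO_BAD_FLAGS);
	if (attr->sampling && attr->sample_period == 0)
		return demo_fail(why, DEMO_ZERO_PERIOD);
	return 0;
}

/* Map a reason code to human-readable text for the tooling to print. */
static const char *demo_reason_str(enum demo_reason reason)
{
	if ((unsigned int)reason >= DEMO_REASON_COUNT)
		return "unknown reason";
	return demo_reason_text[reason];
}
```

A caller that passes NULL for the reason sees only -EINVAL for all three
failures; a caller that passes a pointer can hand demo_reason_str() output
straight to the user instead of guessing.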

The other option would be to replicate all the failure analysis in user-space -
which sucks and which it cannot even do in some important cases.

The third option is to maintain the status quo: let Linux tooling continue to suck
wrt. failure analysis.

> Also... we're talking only of perf, so perhaps some perf-specific reporting
> scheme would be better, rather than a kernel-wide thing.

That was indeed the starting point, at which point scheduler syscalls came up, and
potentially other places in the kernel were mentioned, so we thought we'd try to
be more generally useful.

To address the inevitable "why did you code this up in a perf-specific way??!"
complaints and such ;-)

