Re: [PATCH v2 0/6] perf: Introduce extended syscall error reporting

From: Ingo Molnar
Date: Tue Aug 25 2015 - 06:07:36 EST



* Johannes Berg <johannes@xxxxxxxxxxxxxxxx> wrote:

> On Tue, 2015-08-25 at 11:17 +0200, Ingo Molnar wrote:
> >
> > If we do that then we don't even have to introduce per system call error code
> > conversion, but could unconditionally save the last extended error info in the
> > task struct and continue - this could be done very cheaply with the linker trick
> > driven integer ID.
> >
> > I.e. system calls could opt in to do:
> >
> > > return err_str(-EBUSY, "perf/x86: BTS conflicts with active events");
> >
> > and the overhead of this would be minimal, we'd essentially do something like this
> > to save the error:
> >
> > > current->err_code = code;
> >
> > where 'code' is a build time constant in essence.
> >
> > We could use this even in system calls where the error path is performance
> > critical, as all the string recovery and copying overhead would be triggered by
> > applications that opt in via the new system call:
> >
> > > struct err_desc {
> > > const char *message;
> > > const char *owner;
> > > const int code;
> > > };
> >
> > > SyS_err_get_desc(struct err_desc *err_desc __user);
> >
> > [ Which could perhaps be a prctl() extension as well (PR_GET_ERR_DESC): finally
> > some truly matching functionality for prctl(). ]
> >
> > Hm?
>
> That's neat in a way, but doesn't work in general I think.

Ok, I see the netlink problem - but it would work in the perf and scheduler cases,
except for the small wart that it's not signal safe by default. (Apps could either
save/restore it themselves in their signal handlers, via PR_SET_ERR_DESC, or we
could extend the signal frame with the code.)
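
To illustrate the save/restore option: it is the same discipline signal
handlers already need for errno. Here is a minimal userspace sketch that uses
errno as a stand-in, since PR_GET_ERR_DESC and PR_SET_ERR_DESC are only
proposals in this thread and do not exist. The function names are made up for
illustration.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Handler that preserves the interrupted code's errno. */
static void preserving_handler(int sig)
{
	int saved = errno;	/* save on entry */

	(void)sig;
	close(-1);		/* fails and sets errno to EBADF */
	errno = saved;		/* restore before returning */
}

/* Returns 1 if errno survived a signal delivered after it was set. */
static int errno_survives_signal(void)
{
	if (signal(SIGUSR1, preserving_handler) == SIG_ERR)
		return 0;
	errno = EBUSY;
	raise(SIGUSR1);		/* handler runs before raise() returns */
	return errno == EBUSY;
}
```

An extended error code kept in the task struct would need the equivalent
save and restore steps, unless the signal frame carried it automatically.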

Having a separate syscall has two (big!) appeals:

- we wouldn't have to touch existing system calls at all.

- extended error reporting would be available for any system call that opts to
use it. (The current scheme as submitted is only available to system calls
using the perf-style flexible attribute ABI.)

Regarding netlink:

> Considering the wifi case, or more generally any netlink based
> protocol, the syscall (sendmsg) won't return an error, but a subsequent
> recvmsg() (which also won't return an error) returns an error message
> [in the sense of a protocol message, not a human readable message] to a
> buffer provided by the application.
> However, this message can be extended relatively easily to include the
> string information, but the syscall/prctl wouldn't work since the
> syscalls didn't actually fail.

Ok. So assuming we can make a 1:1 mapping between the 'extended error code'
integer space and the message:owner strings, it would be enough for netlink to
pass along the integer code itself, not the full strings?

That would simplify things and, I suspect, make the scheme more robust from a
security POV.
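
Roughly, as a userspace sketch only: the reply carries just the integer, and
whoever wants the strings looks them up in a table. The struct layout, the
function name and the "perf" owner string are assumptions made for
illustration, not part of any existing netlink ABI.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical table entry: the message:owner pair behind one code. */
struct ext_err {
	const char *message;
	const char *owner;
};

/* Index 0 is reserved to mean "no extended error". */
static const struct ext_err ext_errs[] = {
	{ NULL, NULL },
	{ "perf/x86: BTS conflicts with active events", "perf" },
};

/* Hypothetical reply payload: classic errno plus the extended code only. */
struct err_reply {
	int error;
	int ext_code;
};

/* Format a reply for humans; unknown codes fall back to the bare errno. */
static int err_reply_format(const struct err_reply *r, char *buf, size_t len)
{
	size_t n = sizeof(ext_errs) / sizeof(ext_errs[0]);

	if (r->ext_code <= 0 || (size_t)r->ext_code >= n)
		return snprintf(buf, len, "error %d", r->error);

	return snprintf(buf, len, "error %d: %s [%s]", r->error,
			ext_errs[r->ext_code].message,
			ext_errs[r->ext_code].owner);
}
```

Since only a range-checked integer crosses the boundary, no string pointer
ever has to be stored in or trusted from the message itself.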

> However, it could possibly help with the namespace/module issue if you
> also store THIS_MODULE (or perhaps instead a pointer to the module's
> error table) in the task. Again not in the netlink case though, I
> think, that will always require special handling [although there it
> could be stored away in the socket or so, similar to the task]

So my hope would be that we can represent this all with a single 'large' error
code integer space. That integer would be constant and translatable (as long as
the module is loaded).

That way the error passing mechanism wouldn't have to be specifically module-aware
- during build we generate the integer space, with all possible modules
considered.
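
A rough way to picture the single build-time integer space, using an X-macro
list as a stand-in for the linker trick (this is not the linker-based scheme
itself). All identifiers and the "sched" entry are made up for illustration.

```c
#include <stddef.h>

/* Hypothetical per-subsystem lists; a real build would collect these
 * from every built-in object and every module. */
#define PERF_ERRS(X) \
	X(XERR_PERF_BTS, "perf", "perf/x86: BTS conflicts with active events")

#define SCHED_ERRS(X) \
	X(XERR_SCHED_DEMO, "sched", "sched: made-up example message")

#define ALL_ERRS(X) PERF_ERRS(X) SCHED_ERRS(X)

/* Pass 1: one dense integer space; 0 means "no extended error". */
enum xerr_code {
	XERR_NONE = 0,
#define X(id, owner, msg) id,
	ALL_ERRS(X)
#undef X
	XERR_COUNT
};

struct xerr_desc {
	const char *owner;
	const char *message;
};

/* Pass 2: the translation table, indexed by the same integers. */
static const struct xerr_desc xerr_descs[XERR_COUNT] = {
#define X(id, owner, msg) [id] = { owner, msg },
	ALL_ERRS(X)
#undef X
};

/* Translate a code back to its strings; NULL if it is out of range. */
static const struct xerr_desc *xerr_translate(int code)
{
	if (code <= XERR_NONE || code >= XERR_COUNT)
		return NULL;
	return &xerr_descs[code];
}
```

Every code is a compile-time constant here, so the error path only has to
store an integer and nothing in it needs to know which module owns the entry.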

Thanks,

Ingo