Re: [PATCH v3 02/13] tracing: split out syscall_trace_enter construction

From: Will Drewry
Date: Thu Jun 02 2011 - 11:18:26 EST


On Thu, Jun 2, 2011 at 9:29 AM, Ingo Molnar <mingo@xxxxxxx> wrote:
>
> * Will Drewry <wad@xxxxxxxxxxxx> wrote:
>
[...]
>
>> Based on my observations while exploring the code, it appears that
>> the LSM security_* calls could easily become active trace events
>> and the LSM infrastructure moved over to use those as tracepoints
>> or via event_filters.  There will be a need for new predicates for
>> the various new types (inode *, etc), and so on.  However, the
>> trace_sys_enter/__secure_computing model will still be a special
>> case.
>
> Yes, and that special event will not go away!
>
> I did not suggest to *replace* those events with the security events.
> I suggested to *combine* them - or at least have a model that
> smoothly extends to those events as well and does not limit itself to
> the syscall surface alone.
>
> We'll want to have both.
>
> But by hardcoding to only those events, and creating a
> syscall-numbering special ABI, a wall will be raised between this
> implementation and any future enhancement to cover other events. My
> suggestion would be to use the event filter approach - that way
> there's not a wall but an open door towards future extensions ;-)

Yeah, I can definitely see that. We could have the prctl interface
take in the event id, but that introduces an additional dependency on
CONFIG_PERF_EVENTS (to get the id exported) and means we'll have much
more limited coverage of syscalls until the syscall wrapping matures.

Could this be resolved in the proposed change by supporting both
mechanisms? Or is that just asking for trouble?

E.g., it could be an extra field:
  prctl(PR_SET_SECCOMP_FILTER, PR_SECCOMP_FILTER_TYPE_EVENT,
        event_id, filter_string);
  prctl(PR_SET_SECCOMP_FILTER, PR_SECCOMP_FILTER_TYPE_SYSCALL,
        __NR_somesyscall, filter_string);
  [and the same for CLEAR_FILTER and GET_FILTER]

or even reserve negative values for event ids and positive values for
syscalls (which feels more hackish). Adding event_id support wouldn't
take much additional code (it's just a layer of dereferencing). Since
there will likely be syscall-indexed entry behavior no matter what
(like there is for ftrace/perf_sysenter), it won't necessarily be a
large diversion in the future either.
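
To make the two encodings concrete, here is a rough userspace sketch of
how either form could be normalized into one lookup key. Everything in
it is hypothetical: the enum values, struct filter_key, and both helper
names are made up for illustration and are not part of the patch or of
any kernel header.

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical constants for illustration; not from any kernel header. */
enum seccomp_filter_type {
	SECCOMP_FILTER_TYPE_SYSCALL = 0,
	SECCOMP_FILTER_TYPE_EVENT = 1,
};

/* Hypothetical normalized key: which namespace the id lives in. */
struct filter_key {
	enum seccomp_filter_type type;
	unsigned int id;
};

/* Variant A: the caller passes an explicit type field next to the id. */
static int key_from_typed(int type, long id, struct filter_key *out)
{
	if (out == NULL || id < 0 || id > INT_MAX)
		return -EINVAL;
	if (type != SECCOMP_FILTER_TYPE_SYSCALL &&
	    type != SECCOMP_FILTER_TYPE_EVENT)
		return -EINVAL;
	out->type = (enum seccomp_filter_type)type;
	out->id = (unsigned int)id;
	return 0;
}

/* Variant B: negative values name event ids, the rest name syscalls. */
static int key_from_signed(long id, struct filter_key *out)
{
	if (out == NULL || id > INT_MAX || id < -(long)INT_MAX)
		return -EINVAL;
	if (id < 0) {
		out->type = SECCOMP_FILTER_TYPE_EVENT;
		out->id = (unsigned int)(-id);
	} else {
		out->type = SECCOMP_FILTER_TYPE_SYSCALL;
		out->id = (unsigned int)id;
	}
	return 0;
}
```

Both helpers end up with the same key, which is the "layer of
dereferencing" point. One wrinkle in the signed variant is that 0 always
means syscall 0, so an event id of 0 could not be expressed.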

If not, seccomp_filter could depend on both FTRACE_SYSCALLS and
exported PERF_EVENTS (or make the "id"s not perf_event specific) and
just use the sys_enter event ids. That approach has some properties
I'm less fond of, like requiring debugfs to be compiled in, mounted,
and readable by the caller in order to construct a filterset. So I can
still see a benefit to using syscall numbers in some cases (they are
much easier to deploy on a server without debugfs access, etc). Right
now, having both interfaces doesn't really give us anything, but
having the field set aside for future exploration isn't necessarily a
bad thing!
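
For a sense of the debugfs cost, this is roughly what a caller would
have to do just to learn one event id before it could build a filter.
It is only a sketch: read_event_id is a made-up helper, and the example
path in the comment assumes debugfs is mounted at /sys/kernel/debug with
the syscall events exposed there.

```c
#include <stdio.h>

/*
 * Hypothetical helper: read a decimal event id from an "id" file.
 * On a typical setup the path would look something like
 *   /sys/kernel/debug/tracing/events/syscalls/sys_enter_open/id
 * (assumed mount point; adjust for the system at hand).
 * Returns the id, or -1 if the file is missing, unreadable, or malformed.
 */
static long read_event_id(const char *path)
{
	FILE *f = fopen(path, "r");
	long id = -1;

	if (f == NULL)
		return -1;	/* debugfs absent, unmounted, or unreadable */
	if (fscanf(f, "%ld", &id) != 1 || id < 0)
		id = -1;	/* contents were not a non-negative integer */
	fclose(f);
	return id;
}
```

With plain syscall numbers none of this is needed, since __NR_* values
are available at compile time.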

What do you think? Would a change to support both be too crazy/dumb or
just crazy/dumb enough? Or do you see another path that could avoid
isolating any current work from a more fruitful future?

thanks!
will