Re: [PATCH] [RFC] tracehook: Hook in syscall tracing markers.
From: Roland McGrath
Date: Fri Sep 26 2008 - 06:43:23 EST
Sorry I've been slow in catching up on email threads after last week.
As to the arguments, the asm/syscall.h interface is there for that
(see the asm-generic/syscall.h comments). A tracepoint that passes
the regs argument through makes those easy to use. (Actually, it's
always the same as task_pt_regs(current), but since it's on hand in
an argument register anyway, might as well use it.)
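
As a rough illustration of that accessor style, here is a userspace sketch, not kernel code. The struct, its field names (loosely modelled on x86-64 argument registers), and every mock_* name are assumptions made up for this example; the real accessors are the ones documented in asm-generic/syscall.h. It shows a probe that receives only a regs pointer and pulls the "up to 6" argument values out through one helper.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for struct pt_regs (userspace mock, not the
 * kernel type). Field names loosely follow x86-64 argument registers. */
struct mock_pt_regs {
    unsigned long di, si, dx, r10, r8, r9;
    long orig_ax;               /* syscall number */
};

/* Mock accessor: copy n argument values, starting at argument index i. */
static void mock_syscall_get_arguments(const struct mock_pt_regs *regs,
                                       unsigned int i, unsigned int n,
                                       unsigned long *args)
{
    const unsigned long all[6] = {
        regs->di, regs->si, regs->dx, regs->r10, regs->r8, regs->r9
    };

    assert(i + n <= 6);         /* at most six register arguments */
    memcpy(args, &all[i], n * sizeof(all[0]));
}

/* What a probe might record for one syscall entry. */
struct mock_syscall_record {
    long nr;
    unsigned long args[6];
};

/* Hypothetical probe body: it is handed only the regs pointer and uses
 * the accessor, so it never needs to know the register layout itself. */
static void mock_probe_syscall_entry(const struct mock_pt_regs *regs,
                                     struct mock_syscall_record *rec)
{
    rec->nr = regs->orig_ax;
    mock_syscall_get_arguments(regs, 0, 6, rec->args);
}
```

The values come back as untyped unsigned longs, which is exactly the limitation discussed next.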
Of course these are the generic "up to 6 register values that make up
syscall args", and nothing "generic" inside the kernel addresses the
subject of what the actual C types of each syscall's arguments are.
That is one benefit you get from specific tracepoints inside specific
calls--another is specificity of call, though at the expense of
specificity of which thread gets traced. I think that whole subject is
beyond the scope of what we mean by "a global syscall tracepoint", which
I think is all we're talking about here.
As Frank mentioned, the main problem to be solved here is how to arrange
that all threads get into tracehook_report_syscall_entry/exit to begin
with, i.e. keep TIF_SYSCALL_TRACE set or equivalent.
I think adding TIF_KERNEL_TRACE is just a lousy idea. It does
constitute touching the asm of each arch by hand in subtle ways. On
some machines, there isn't a free bit in the range that's needed to
act equivalent to TIF_SYSCALL_TRACE. So I suspect that, for an arch or
two, Mathieu's arch patches "without changing anything of the assembly
code except adding one flag to the tested flags" in fact change the
assembly code so that it won't work (or won't even assemble).
After CONFIG_ARCH_TRACEHOOK, I really think we want--and can
have--exactly zero changes anywhere in assembly code related to this.
Every arch already implements two ways to get someplace hookable for
every syscall: TIF_SYSCALL_TRACE and TIF_SYSCALL_AUDIT. That's enough.
Currently TIF_SYSCALL_TRACE is only used by ptrace, which does no other
bookkeeping to remember whether it set the flag. It's trivial to make
ptrace not clear TIF_SYSCALL_TRACE when global tracing is enabled. It's
also trivial to set it on every thread to enable global tracing, and
then start new threads with it set. That leaves only some corner cases
about when you disable global tracing. It wouldn't be so hard to fiddle
with ptrace bookkeeping so you can get the bit cleared correctly without
either spurious syscall tracing reports or breaking PTRACE_SYSCALL
operations already in flight. (Incidentally, under utrace there is
already bookkeeping such that it's completely trivial to make it lazily
clear TIF_SYSCALL_TRACE after global tracing has been disabled and
interact with utrace/ptrace right.)
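
To make the bookkeeping idea concrete, here is a small userspace model, not kernel code. Every name in it (mock_thread, ptrace_wants_syscall, the mock_* functions) is hypothetical, and it ignores locking and races entirely. The assumption is simply that each thread remembers separately whether ptrace asked for syscall tracing, so the flag bit can be derived from "global tracing on, or ptrace wants it".

```c
#include <stdbool.h>
#include <stddef.h>

#define MOCK_TIF_SYSCALL_TRACE (1u << 0)   /* stand-in for the real flag */

struct mock_thread {
    unsigned int flags;
    bool ptrace_wants_syscall;   /* the extra bookkeeping ptrace would keep */
};

static bool mock_global_tracing;

/* The bit stays set while either user still needs it. */
static void mock_update_flag(struct mock_thread *t)
{
    if (mock_global_tracing || t->ptrace_wants_syscall)
        t->flags |= MOCK_TIF_SYSCALL_TRACE;
    else
        t->flags &= ~MOCK_TIF_SYSCALL_TRACE;
}

/* ptrace turning syscall tracing on or off for one thread. */
static void mock_ptrace_syscall(struct mock_thread *t, bool on)
{
    t->ptrace_wants_syscall = on;
    mock_update_flag(t);
}

/* Enable or disable global tracing across all existing threads. */
static void mock_set_global_tracing(struct mock_thread *threads, size_t n,
                                    bool on)
{
    mock_global_tracing = on;
    for (size_t i = 0; i < n; i++)
        mock_update_flag(&threads[i]);
}

/* New threads start with the bit set if global tracing is enabled. */
static void mock_init_thread(struct mock_thread *t)
{
    t->flags = 0;
    t->ptrace_wants_syscall = false;
    mock_update_flag(t);
}

/* Avoid spurious ptrace reports: only report if ptrace asked for them. */
static bool mock_report_to_ptrace(const struct mock_thread *t)
{
    return (t->flags & MOCK_TIF_SYSCALL_TRACE) && t->ptrace_wants_syscall;
}
```

In this model, disabling global tracing leaves the bit alone on any thread with a PTRACE_SYSCALL still pending, and a thread that has the bit only because of global tracing never generates a ptrace report.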
Back to the arguments, one thing is worth mentioning for the record:
the asm/syscall.h approach (extracting via pt_regs) is quite cheap
(inlined away to a word copy) on almost all machines, but is more
costly on ia64 (perhaps significantly so). On ia64, it may make a real
performance difference to have a tracepoint of some kind that passes the
arguments directly in its signature. Conversely, on other machines
these several arguments vs the one struct pt_regs * could well clutter
up the generated code around a tracepoint.
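
The two tracepoint shapes being weighed here can be sketched side by side. This is a userspace illustration only; sketch_regs, sketch_log and both probe functions are hypothetical names, and the register snapshot is simplified to a plain array.

```c
/* Hypothetical, simplified register snapshot. */
struct sketch_regs {
    long nr;                  /* syscall number */
    unsigned long arg[6];     /* up to six argument values */
};

/* What either probe style ends up recording. */
struct sketch_log {
    long nr;
    unsigned long arg[6];
};

/* Shape 1: the tracepoint passes one pointer; the probe extracts the
 * values itself. Cheap where extraction is just a word copy. */
static void sketch_probe_regs(const struct sketch_regs *regs,
                              struct sketch_log *log)
{
    log->nr = regs->nr;
    for (int i = 0; i < 6; i++)
        log->arg[i] = regs->arg[i];
}

/* Shape 2: the tracepoint passes the values in its signature. The call
 * site must marshal every value whether or not a probe uses it. */
static void sketch_probe_direct(long nr,
                                unsigned long a0, unsigned long a1,
                                unsigned long a2, unsigned long a3,
                                unsigned long a4, unsigned long a5,
                                struct sketch_log *log)
{
    log->nr = nr;
    log->arg[0] = a0;
    log->arg[1] = a1;
    log->arg[2] = a2;
    log->arg[3] = a3;
    log->arg[4] = a4;
    log->arg[5] = a5;
}
```

Both record the same data; the difference is where the extraction cost lands, in the probe (shape 1) or at every call site (shape 2).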
The audit hooks take direct syscall argument register value parameters
in this way, though only four of the possibly six argument registers.
This ties into the alternative of using TIF_SYSCALL_AUDIT instead. With
a differently-placed tracepoint (in auditsc.c), this could work either
with or without an actual audit context. On another tangent, if you
were to enable a normal audit context, this also gets you audit_getname,
which gets obliquely towards part of the "syscall arguments with types"
puzzle. Another caveat is that the audit path is not a place where a
tracepoint probe function can change the syscall arguments or number on
entry, or the result on exit; only tracehook_report_syscall_entry/exit
can safely do that.
Thanks,
Roland