Re: System call instrumentation

From: Mathieu Desnoyers
Date: Thu May 22 2008 - 08:48:00 EST


* Arjan van de Ven (arjan@xxxxxxxxxxxxx) wrote:
> On Mon, 19 May 2008 23:44:53 -0400
> Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxx> wrote:
>
> > * Ingo Molnar (mingo@xxxxxxx) wrote:
> > >
> > > * Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxx> wrote:
> > >
> > > > Ideally, I'd like to have this kind of high-level information :
> > > >
> > > > event name : kernel syscall
> > > > syscall name : open
> > > > arg1 (%s) : "somefile" <-----
> > > > arg2 (%d) : flags
> > > > arg3 (%d) : mode
> > > >
> > > > However, "somefile" has to be read from userspace. With the
> > > > protection involved, it would cause a performance impact to read
> > > > it a second time rather than tracing the string once it's been
> > > > copied to kernel-space.
>
> the audit subsystem already does all of this... why not use that??
> (And it goes through great lengths to do it securely)
>
> > >
>
> > Hrm, a quick benchmark on my pentium 4 comparing a normal open()
> > system call executed in a loop to a modified open() syscall which
> > executes the lines added in the following patch adds 450 cycles to
> > each open() system call. I added a putname/getname on purpose to see
> > the cost of a second userspace copy and it's not exactly free.
>
> copying twice does mean that if the user wants, he can cheat you. He
> can, in another thread, change the string under you. So say you're
> doing this for anti-virus purposes, he can make you scan one file and
> open another.
>
>
> The audit subsystem was carefully designed to avoid this trap... how
> about using that?

Hrm, given that tracing will have to grab __user * parameters passed to
various system calls, not just strings, the getname/putname
infrastructure would need to be expanded a lot. I doubt it's worth
adding that complexity (copies to temporary memory buffers plus
reference counting) to those system calls just to support kernel-wide
tracing.

On the other hand, adding a marker in the traced function, at a code
location where the data copied into the kernel is accessible, won't add
such complexity and will help keep good locality of reference (the
stack is meant to be a cache-hot memory region). Because a dormant
marker has no significant performance hit (actually, my benchmarks show
a small speedup of the overall system, probably due to changes in how
the code is laid out across cache lines), I think it's legitimate to add
this kind of instrumentation to the existing kernel system call
functions.
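
To make the idea concrete, here is a rough userspace sketch, not kernel
code and not the actual marker implementation. It assumes a marker can
be modelled as a flag tested at the call site, and every name in it
(marker_enabled, probe_open, copy_from_user_sim, sim_open) is
hypothetical. It only illustrates the ordering: the filename is copied
once, and the probe is handed that same copy, so the trace and the
open operate on identical data and no second userspace read is needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define NAME_MAX_LEN 64

/* Hypothetical stand-in for a marker: while this flag is false the
 * marker is dormant and costs only one well-predicted branch. */
static bool marker_enabled = false;

/* Last event seen by the probe; stands in for a trace buffer. */
static char traced_name[NAME_MAX_LEN];
static int traced_flags;
static int traced_count;

/* Hypothetical probe: records the arguments it is handed. */
static void probe_open(const char *kname, int flags)
{
    snprintf(traced_name, sizeof(traced_name), "%s", kname);
    traced_flags = flags;
    traced_count++;
}

/* Stand-in for copying a user string into kernel memory. */
static void copy_from_user_sim(char *dst, size_t len, const char *user_src)
{
    snprintf(dst, len, "%s", user_src);
}

/* Simulated open(): one copy of the filename, and the marker sits
 * after that copy so the probe sees the kernel-side data. */
static int sim_open(const char *user_filename, int flags,
                    char *opened, size_t opened_len)
{
    char kname[NAME_MAX_LEN];   /* on the stack, so likely cache-hot */

    assert(opened != NULL && opened_len > 0);
    copy_from_user_sim(kname, sizeof(kname), user_filename);

    if (marker_enabled)         /* dormant marker: branch not taken */
        probe_open(kname, flags);

    /* The "real work" uses the same copy that was traced. */
    snprintf(opened, opened_len, "%s", kname);
    return 0;
}
```

Calling sim_open() with marker_enabled left false records nothing;
setting it to true records the name and flags from the single
kernel-side copy. Overwriting the caller's buffer afterwards changes
neither the traced name nor the opened name, which is the property a
second userspace read would not give.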

Mathieu

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68