Re: System call instrumentation

From: Mathieu Desnoyers
Date: Mon May 19 2008 - 23:45:41 EST

* Ingo Molnar (mingo@xxxxxxx) wrote:
> * Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxx> wrote:
> > Ideally, I'd like to have this kind of high-level information :
> >
> > event name : kernel syscall
> > syscall name : open
> > arg1 (%s) : "somefile" <-----
> > arg2 (%d) : flags
> > arg3 (%d) : mode
> >
> > However, "somefile" has to be read from userspace. With the protection
> > involved, it would cause a performance impact to read it a second time
> > rather than tracing the string once it's been copied to kernel-space.
> performance is a secondary issue here, and copies are fast anyway _if_
> someone wants to trace a syscall. (because the first copy brings the
> cacheline into the cache, subsequent copies are almost for free compared
> to the first copy)
> Ingo

Hrm, a quick benchmark on my Pentium 4 says otherwise. I compared a
normal open() system call executed in a loop against a modified open()
that runs the lines added in the patch below: the modified version adds
450 cycles to each open() call. I added a putname/getname pair on
purpose to see the cost of a second userspace copy, and it's not exactly
free.
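
A rough userspace sketch of this kind of loop benchmark, for anyone who
wants to repeat the comparison on a stock and a patched kernel. This is not
the code used for the measurement above: the helper names (now_ns,
avg_open_ns) are made up for illustration, it reports nanoseconds from
clock_gettime() rather than cycles, and each iteration also pays for a
close(). Since both kernels pay the same close() cost, the difference
between the two runs should still reflect the extra copy.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdint.h>
#include <time.h>
#include <unistd.h>

/* Monotonic clock reading in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Average cost, in nanoseconds, of one open()+close() pair on 'path'
 * over 'iters' iterations. Returns -1.0 if iters is 0 or open() fails.
 * (Hypothetical helper, not part of any kernel or libc API.)
 */
static double avg_open_ns(const char *path, unsigned long iters)
{
	uint64_t start, end;
	unsigned long i;

	if (iters == 0)
		return -1.0;

	start = now_ns();
	for (i = 0; i < iters; i++) {
		int fd = open(path, O_RDONLY);

		if (fd < 0)
			return -1.0;
		close(fd);
	}
	end = now_ns();

	return (double)(end - start) / (double)iters;
}
```

Calling avg_open_ns("/dev/null", 1000000) on each kernel and subtracting
the two averages gives the per-call overhead; converting it to cycles
requires the CPU frequency.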

Tracing correctly nested within the normal getname, re-using the string
already copied, should not suffer from that kind of performance hit.
Also, since the string would be copied only once from userspace, it
would eliminate race scenarios where a multithreaded application changes
the string underneath, causing the kernel to trace a different string
than the one really used for the system call.
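
The race can be illustrated with a small deterministic userspace
simulation. Everything here is hypothetical scaffolding: fetch_name()
stands in for the userspace-to-kernel copy, and the 'racer' callback
stands in for another thread rewriting the buffer between two copies. It
is only a sketch of the double-fetch problem, not kernel code.

```c
#include <stdio.h>
#include <string.h>

#define NAME_LEN 64

/* What the "syscall" used versus what the "tracer" recorded. */
struct traced_call {
	char used[NAME_LEN];
	char traced[NAME_LEN];
};

/* Stand-in for copying a string from userspace into a kernel buffer. */
static void fetch_name(char *dst, const char *user_src)
{
	snprintf(dst, NAME_LEN, "%s", user_src);
}

/* Simulates another thread overwriting the user buffer. */
static void swap_name(char *user_buf)
{
	snprintf(user_buf, NAME_LEN, "%s", "otherfile");
}

/* Tracer copies from userspace a second time: it can see a new string. */
static void double_fetch(struct traced_call *c, char *user_buf,
			 void (*racer)(char *))
{
	fetch_name(c->used, user_buf);
	if (racer)
		racer(user_buf);
	fetch_name(c->traced, user_buf);
}

/* Tracer re-uses the kernel copy: it always matches what was used. */
static void single_fetch(struct traced_call *c, char *user_buf,
			 void (*racer)(char *))
{
	fetch_name(c->used, user_buf);
	if (racer)
		racer(user_buf);
	memcpy(c->traced, c->used, NAME_LEN);
}
```

With a buffer holding "somefile" and swap_name as the racer,
double_fetch() records "otherfile" in the trace while the call used
"somefile"; single_fetch() records "somefile" in both fields.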

However, strings are not the only userspace arguments passed to system
calls. For all these other arguments, performance could be an issue too,
as could racy user-level data modification, which would let the kernel
trace a different parameter than the one being used in the system call.

For those two reasons, I think extracting these parameters could be
faster/cleaner/safer if done in the system call function, where the
parameters are already copied in kernel space.


Index: linux-2.6-lttng/fs/open.c
--- linux-2.6-lttng.orig/fs/open.c 2008-05-19 22:51:16.000000000 -0400
+++ linux-2.6-lttng/fs/open.c 2008-05-19 23:11:07.000000000 -0400
@@ -1043,6 +1043,8 @@ long do_sys_open(int dfd, const char __u
 	int fd = PTR_ERR(tmp);
 
 	if (!IS_ERR(tmp)) {
+		putname(tmp);
+		tmp = getname(filename);
 		fd = get_unused_fd_flags(flags);
 		if (fd >= 0) {
 			struct file *f = do_filp_open(dfd, tmp, flags, mode);

Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68