Re: [PATCH 4/6] trace: trace syscall in its handler not from ptrace handler

From: H. Peter Anvin
Date: Wed Mar 28 2012 - 23:15:33 EST

On 03/28/2012 07:59 PM, Steven Rostedt wrote:
> On Wed, 2012-03-28 at 19:43 -0700, H. Peter Anvin wrote:
>> The syscall interface is the single most stable interface in the kernel.
>> Just plunk down the system call number and the six arguments in the
>> buffer, and be done with it. On the way out, there is a single return
>> argument, *by design*. No need to burden the kernel in this way! That
>> this information can be perfectly well decoded in userspace is already
>> shown by strace, although it would be highly beneficial if the kernel
>> build could export information to strace and other tools. There is
>> absolutely no need for it to live in kernel memory, though.
> Even if it did live in kernel memory (which it does now, and I'm not
> sure if we can change it due to the *don't break existing tools* law).
> We should be able to at least compress it so that it doesn't waste as
> much memory.
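
The "number plus six arguments in, one return value out" record described in
the quoted text can be sketched as follows. This is only an illustration
under stated assumptions: the struct names, field widths, and the
put_syscall_enter()/put_syscall_exit() helpers are hypothetical and are not
the kernel's actual trace format or API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SYSCALL_NARGS 6

/* Hypothetical entry record: syscall number plus the six raw arguments. */
struct syscall_enter_rec {
	int64_t  nr;
	uint64_t args[SYSCALL_NARGS];
};

/* Hypothetical exit record: syscall number plus the single return value. */
struct syscall_exit_rec {
	int64_t nr;
	int64_t ret;
};

/*
 * Copy an entry record into buf.
 * Returns the number of bytes written, or 0 if the record does not fit.
 */
size_t put_syscall_enter(void *buf, size_t len, int64_t nr,
			 const uint64_t args[SYSCALL_NARGS])
{
	struct syscall_enter_rec rec;

	assert(buf != NULL);
	if (len < sizeof(rec))
		return 0;

	rec.nr = nr;
	memcpy(rec.args, args, sizeof(rec.args));
	memcpy(buf, &rec, sizeof(rec));
	return sizeof(rec);
}

/*
 * Copy an exit record into buf.
 * Returns the number of bytes written, or 0 if the record does not fit.
 */
size_t put_syscall_exit(void *buf, size_t len, int64_t nr, int64_t ret)
{
	struct syscall_exit_rec rec = { .nr = nr, .ret = ret };

	assert(buf != NULL);
	if (len < sizeof(rec))
		return 0;

	memcpy(buf, &rec, sizeof(rec));
	return sizeof(rec);
}
```

Nothing in such a record names or types the arguments; a userspace decoder
would do that from the syscall number, the way strace does.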

This whole facility is the logical equivalent of doing binary-to-ascii
conversion with a switch statement:

	switch (foo) {
	case 0:
		return '0';
	case 1:
		return '1';
	case 2:
		return '2';
	/* ... */
	}

We see that kind of code on The Daily WTF all the time, but there is no
excuse for it being seen anywhere close to the Linux kernel.
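
For contrast, the conversion in the analogy needs no per-value cases at all.
A minimal sketch, assuming a single decimal digit as input (the function
name digit_to_ascii() is made up for illustration):

```c
#include <assert.h>

/*
 * Convert one decimal digit (0..9) to its ASCII character.
 * C guarantees that '0'..'9' are contiguous, so a single addition
 * replaces a ten-way switch.
 */
char digit_to_ascii(unsigned int digit)
{
	assert(digit <= 9);
	return (char)('0' + digit);
}
```

The point of the comparison is that the per-case version stores in code
what can be derived from the value itself.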

Furthermore, if we can't even fix grotesque brokenness like this in
*debugging tools*, then we might as well go home, as there is absolutely
no hope of ever making forward progress. This is worse than "let's pick
up a bunch of random kernel internals and make them stable ABIs" Xen.


