Re: [PATCH 2 of 4] Introduce i386 fibril scheduling

From: Linus Torvalds
Date: Mon Feb 05 2007 - 15:40:00 EST




On Mon, 5 Feb 2007, Davide Libenzi wrote:
>
> No, that's *really* it ;)
>
> The cookie you pass, and the return code of the syscall.
> If there other data transfered? Sure, but that data transfered during the
> syscall processing, and handled by the syscall (filling up a sys_read
> buffer just for example).

Indeed. One word is *exactly* what a normal system call returns too.

That said, normally we have a user-space library layer to turn that into
the "errno + return value" thing, and in the case of async() calls we
very basically wouldn't have that. So either:

- we'd need to do it in the kernel (which is actually nasty, since
different system calls have slightly different semantics - some don't
return any error value at all, and negative numbers are real numbers)

- we'd have to teach user space about the "negative errno" mechanism, in
which case one word really is alwats enough.

Quite frankly, I much prefer the second alternative. The "negative errno"
thing has not only worked really really well inside the kernel, it's so
obviously 100% superior to the standard UNIX "-1 + errno" approach that
it's not even funny.

To see why "negative errno" is better, just look at any threaded program,
or look at any program that does multiple calls and needs to save the
errno not from the last one, but from some earlier one (eg, it does a
"close()" in between returning *its* error, and the real operation that
we care about).

> Did I miss something? The async() syscall will allow (with few
> restrictions) to execute whatever syscall in an async fashion. An syscall
> returns a result code (long). Plus, you need to pass back the
> userspace-provided cookie of course.

HOWEVER, they get returned differently. The cookie gets returned
immediately, the system call result gets returned in-memory only after the
async thing has actually completed.

I would actually argue that it's not the kernel that should generate any
cookie, but that user-space should *pass*in* the cookie it wants to, and
the kernel should consider it a pointer to a 64-bit entity which is the
return code.

In other words, the only "cookie" we need is literally the pointer to the
results. And that's obviously something that the user space has to set up
anyway.

So how about making the async interface be:

// returns negative for error
// zero for "synchronous"
// positive kernel "wait for me" cookie for success
long sys_async_submit(
unsigned long flags,
long *user_result_ptr,
long syscall,
unsigned long *args);

and the "args" thing would literally just fill up the registers.

The real downside here is that it's very architecture-specific this way,
and that means that x86-64 (and other 64-bit ones) would need to have
emulation layers for the 32-bit ones, but they likely need to do that
*anyway*, so it's probably not a huge downside. The alternative is to:

- make a new architecture-independent system call enumeration for the
async interface

- make everything use 64-bit values.

Now, making an architecture-independent system call enumeration may
actually make sense regardless, because it would allow sys_async() to have
its own system call table and put the limitations and rules for those
system calls there, instead of depending on the per-architecture system
call table that tends to have some really architecture-specific details.

Hmm?

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/