Re: [PATCH 2 of 4] Introduce i386 fibril scheduling

From: Linus Torvalds
Date: Fri Feb 02 2007 - 15:15:05 EST

Next message: Douglas Gilbert: "Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR"
Previous message: Matt Mackall: "Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR"
In reply to: Alan: "Re: [PATCH 2 of 4] Introduce i386 fibril scheduling"
Next in thread: Davide Libenzi: "Re: [PATCH 2 of 4] Introduce i386 fibril scheduling"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Fri, 2 Feb 2007, Alan wrote:
>
> This one got shelved while I sorted other things out as it warranted a
> longer look. Some comments follow, but firstly can we please bury this
> "fibril" name. The constructs Zach is using appear to be identical to
> co-routines, and they've been called that in computer science literature
> for fifty years. They are one of the great and somehow forgotten ideas.
> (and I admit I've used them extensively in past things where its
> wonderful for multi-player gaming so I'm a convert already).

Well, they are indeed coroutines, but they are coroutines in the same
sense any "CPU scheduler" ends up being a coroutine.

They are NOT the generic co-routine that some languages support natively.
So I think trying to call them coroutines would be even more misleading
than calling them fibrils.

In other workds the whole *point* of the fibril is that you can do
"coroutine-like stuff" while using a "normal functional linear programming
paradign".

Wouldn't you agree?

(I love the concept of coroutines, but I absolutely detest what the code
ends up looking like. There's a good reason why people program mostly in
linear flow: that's how people think consciously - even if it's obviously
not how the brain actually works).

And we *definitely* don't want to have a coroutine programming interface
in the kernel. Not in C.

> The stuff however isn't as free as you make out. Current kernel logic
> knows about various things being "safe" but with fibrils you have to
> address additional questions such as "What happens if I issue an I/O and
> change priority". You also have an 800lb gorilla hiding behind a tree
> waiting for you in priviledge and permission checking.

This is why I think it should be 100% clear that things happen in process
context. That just answers everything. If you want to synchronize with
async events and change IO priority, you should do exactly that:

wait_for_async();
ioprio(newprority);

and that "solves" that problem. Leave it to user space.

> Right now current->*u/gid is safe across a syscall start to end, with an
> asynchronous setuid all hell breaks loose. I'm not saying we shouldn't do
> this, in fact we'd be able to do some of the utterly moronic poxix thread
> uid handling in kernel space if we did, just that it isn't free. We have
> locking rules defined by the magic serializing construct called
> "the syscall" and you break those.

I agree. As mentioned, we probably will have fallout.

> The number of co-routines and stacks can be dealt with two ways - you use
> small stacks allocated when you create a fibril, or you grab a page, use
> separate IRQ stacks and either fail creation with -ENOBUFS etc which
> drops work on user space, or block (for which cases ??) which also means
> an overhead on co-routine exits. That can be tunable, for embedded easily
> tuned right down.

Right. It should be possible to just say "use a max parallelism factor of
5", and if somebody submits a hundred AIO calls and they all block, when
it hits #6, it will just do it synchronously.

Basically, what I'm hoping can come out of this (and this is a simplistic
example, but perhaps exactly *because* of that it hopefully also shows
that we canactually make *simple* interfaces for complex asynchronous
things):

struct one_entry *prev = NULL;
struct dirent *de;

while ((de = readdir(dir)) != NULL) {
struct one_entry *entry = malloc(..);

/* Add it to the list, fill in the name */
entry->next = prev;
prev = entry;
strcpy(entry->name, de->d_name);

/* Do the stat lookup async */
async_stat(de->d_name, &entry->stat_buf);
}
wait_for_async();
.. Ta-daa! All done ..

and it *should* allow us to do all the stat lookup asynchronously.

Done right, this should basically be no slower than doing it with a real
stat() if everything was cached. That would kind of be the holy grail
here.

> You get some other funny things from co-routines which are very powerful,
> very dangerous, or plain insane

You forgot "very hard to think about".

We DO NOT want coroutines in general. It's clever, but it's
(a) impossible to do without language support that C doesn't have, or
some really really horrid macro constructs that really only work for
very specific and simple cases.
(b) very non-intuitive unless you've worked with coroutines a lot (and
almost nobody has)

> Linus will now tell me I'm out of my tree...

I don't think you're wrong in theory, I just thnk that in practice,
withing the confines of (a) existing code, (b) existing languages, and (c)
existing developers, we really REALLY don't want to expose coroutines as
such.

But if you wanted to point out that what we want to do is get the
ADVANTAGES of coroutines, without actually have to program them as such,
then yes, I agree 100%. But we shouldn't call them coroutines, because the
whole point is that as far as the user interface is concerned, they don't
look like that. In the kernel, they just look like normal linear
programming.

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Next message: Douglas Gilbert: "Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR"
Previous message: Matt Mackall: "Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR"
In reply to: Alan: "Re: [PATCH 2 of 4] Introduce i386 fibril scheduling"
Next in thread: Davide Libenzi: "Re: [PATCH 2 of 4] Introduce i386 fibril scheduling"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]