Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3

From: Evgeniy Polyakov
Date: Sun Feb 25 2007 - 13:14:01 EST


On Sun, Feb 25, 2007 at 06:45:05PM +0100, Ingo Molnar (mingo@xxxxxxx) wrote:
>
> * Evgeniy Polyakov <johnpol@xxxxxxxxxxx> wrote:
>
> > My main concern was only about the situation where we end up with a
> > truly blocking context (like network IO), and this results in
> > thousands of threads doing the work - even with most of them
> > sleeping, there is a problem with memory overhead and context
> > switching. It is a usable situation, but when all of them become
> > ready at once, context switching will kill the machine - even with
> > the O(1) scheduler, which made the situation much better than
> > before, but is not a cure for the problem.
>
> Yes. This is why in the original fibril discussion I concentrated so
> much on scheduling performance.
>
> To me the picture is this: conceptually, the scheduler runqueue is a
> queue of work. Items get queued upon certain events, and they can
> unqueue themselves. (There is also register context, but that is
> already optimized to death by hardware.) So whatever scheduling
> overhead we have, it's a pure software thing: it's because we have
> priorities attached, because we have some legacies, etc. - it's all
> stuff /we/ wanted to add, but nothing truly fundamental on top of the
> basic 'work queueing' model.
>
> Now look at kevents as the queueing model. It does not queue 'tasks';
> it lets user-space queue requests, in essence, in various states. But
> it's still the same conceptual thing: a memory buffer with some state
> associated with it. Yes, it has no legacies, no priorities and no
> other queueing concepts attached to it ... yet. If kevents went
> mainstream, they would get the same kind of pressure to grow 'more
> advanced' event queueing and event scheduling capabilities.
> Prioritization would be needed, etc.
>
> So my fundamental claim is: a kernel thread /is/ our main request
> structure. We've got tons of really good system calls that queue these
> 'requests' around the place and offer functionality around this
> concept. Plus there are 1.2+ billion lines of Linux userspace code
> that work well with this abstraction - while there are barely a few
> thousand lines of event-based user-space code.
>
> I also say that you'll likely see kevents outperform threadlets, maybe
> even significantly so under the right conditions. But I very much
> believe we want to get a similar kind of performance out of
> thread/task scheduling, and not introduce a parallel framework that
> does request scheduling the hard way ... just because our task concept
> and scheduling implementation got too fat. For the same reason I
> didn't really like fibrils: they are nice, and Zach's core idea, I
> think, nicely survived in the syslet/threadlet model too, but they are
> more limited than true threads. So doing that parallel infrastructure,
> which really just implements the same thing and is only faster because
> it skips features, would just be hiding the problem with our primary
> abstraction. Ok?

A kevent is a _very_ small entity, and there is essentially _no_ cost
to requeueing one (well, there is a list_add guarded by a lock) - once
that is done, the process can start real work. With rescheduling there
are _too_ many things to be done before we can start new work: we have
to switch registers, switch the address space, touch various TLB bits
and so on. We have to do all of that because a task describes a very
heavy entity - the whole process.
IO, in turn, is a very small subset of what a process is (and can do),
so there is no need to switch the whole picture; it is enough to have
one process, which does the work.
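
To make that concrete, here is a minimal sketch of what "requeueing"
amounts to - illustrative only, not the actual kevent code; the
structure and function names are made up. The completion path takes a
spinlock, does one list_add_tail, and is done; no registers, address
space or TLB state is touched:

#include <linux/list.h>
#include <linux/spinlock.h>

struct event_queue {
	spinlock_t		lock;
	struct list_head	ready;
};

struct event {
	struct list_head	entry;
	void			*data;
};

/*
 * Called from completion context (possibly IRQ, hence irqsave).
 * The whole cost of making an event ready is one locked list
 * insertion; a single worker process later consumes the ready
 * list without any context switch.
 */
static void event_ready(struct event_queue *q, struct event *e)
{
	unsigned long flags;

	spin_lock_irqsave(&q->lock, flags);
	list_add_tail(&e->entry, &q->ready);
	spin_unlock_irqrestore(&q->lock, flags);
}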

Threads are a bit smaller than a process, but one per IO is still too
heavy - so we have pools. This decreases the rescheduling overhead, but
it limits parallelism (the sketch below shows why).
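
A userspace sketch of the pool tradeoff (the names and the pool size
are mine, not from any patch in this thread): rescheduling cost is
bounded by the pool size, but so is the number of IOs in flight - once
all workers block, further requests just sit in the queue.

#include <pthread.h>
#include <stddef.h>

#define NWORKERS 8	/* pool size caps concurrent blocking IOs */

struct request {
	struct request *next;
};

static struct request *queue_head;
static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t queue_more = PTHREAD_COND_INITIALIZER;

static void *worker(void *arg)
{
	(void)arg;
	for (;;) {
		struct request *r;

		pthread_mutex_lock(&queue_lock);
		while (!queue_head)
			pthread_cond_wait(&queue_more, &queue_lock);
		r = queue_head;
		queue_head = r->next;
		pthread_mutex_unlock(&queue_lock);

		/*
		 * A (hypothetical) perform_io(r) would block this
		 * worker; with all NWORKERS blocked, at most NWORKERS
		 * threads ever reschedule, but no new IO starts either.
		 */
	}
	return NULL;
}

int main(void)
{
	pthread_t tid[NWORKERS];
	int i;

	for (i = 0; i < NWORKERS; i++)
		pthread_create(&tid[i], NULL, worker, NULL);
	/* a submitter would link requests into queue_head under
	 * queue_lock and pthread_cond_signal(&queue_more) */
	pthread_join(tid[0], NULL);
	return 0;
}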

I think it is _too_ heavy to have such a monster structure as a task
(thread/process), with all the related overhead, just to do an IO.

> Ingo

--
Evgeniy Polyakov