Re: [Lse-tech] [PATCH] cpusets - big numa cpu and memory placement
From: Martin J. Bligh
Date: Thu Oct 07 2004 - 13:28:16 EST
>> So we have the purely exclusive stuff, which needs kernel support in the form
>> of sched_domains alterations. The rest of cpusets is just poking and prodding
>> at cpus_allowed, the membind API, and the irq binding stuff. All of which
>> you could do from userspace, without any further kernel support, right?
>> Or am I missing something?
>
> Well ... we're gaining. A couple of days ago you were suggesting
> that cpusets could be replaced with some exclusive domains plus
> CKRM.
>
> Now it's some exclusive domains plus poking the affinity masks.
>
> Yes - you're still missing something.
>
> But I must keep in mind that I had concluded, perhaps three years ago,
> just what you conclude now: that cpusets is just poking some affinity
> masks, and that I could do most of it from user land. The result ended
> up missing some important capabilities. User level code could not
> manage collections of hardware nodes (sets of CPUs and Memory Nodes) in
> a co-ordinated and controlled manner.
>
> The users of cpusets need to have system wide names for them, with
> permissions for viewing, modifying and attaching to them, and with the
> ability to list both what hardware (CPUs and Memory) in a cpuset, and
> what tasks are attached to a cpuset. As is usual in such operating
> systems, the kernel manages such system wide synchronized controlled
> access views.
>
> As I quote below, I've been saying this repeatedly. Could you
> tell me, Martin, whether the disconnect is:
> 1) that you didn't yet realize that cpusets provided this model (names,
> permissions, ...) or
> 2) you don't think such a model is useful, or
> 3) you think that such a model can be provided sensibly from user space?
>
> If I knew this, I could focus my response better.
>
> The rest of this message is just quotes from this last week - many
> can stop reading here.
My main problem is that I don't think we want lots of overlapping complex
interfaces in the kernel. Plus I think some of the stuff proposed is fairly
klunky as an interface (physical binding where it's mostly not needed, and
yes I sort of see your point about keeping jobs on separate CPUs, though I
still think it's tenuous), and makes heavy use of stuff that doesn't work
well (e.g. cpus_allowed). So I'm searching for various ways to address that.
The purely exclusive parts of cpusets can be implemented in a much nicer
manner inside the kernel, by messing with sched_domains, instead of just
using cpus_allowed as a mechanism ... so that seems like much less of a
problem.
The non-exclusive bits seem to overlap heavily with both CKRM and what
could be done in userspace. I still think the physical stuff is rather
obscure, and binding stuff to specific CPUs is an ugly way to say "I want
these two threads to not run on the same CPU". But if we can find some
other way (eg userspace) to allow you to do that should you utterly insist
on doing so, that'd be a convenient way out.
As for the names and permissions issue, both would be *doable* from
userspace, though maybe not as easily as in-kernel. Names would probably
be less hassle than permissions, but neither would be impossible, it seems.
It all just seems like a lot of complexity for a fairly obscure set of
requirements for a very limited group of users, to be honest. Some bits
(eg partitioning system resources hard in exclusive sets) would seem likely
to be used by a much broader audience, and thus are rather more attractive.
But they could probably be done with a much simpler interface than the whole
cpusets (BTW, did that still sit on top of PAGG as well, or is that long
gone?)
M.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/