Re: exclusive cpusets broken with cpu hotplug

From: Nick Piggin
Date: Thu Oct 19 2006 - 03:05:09 EST


Paul Jackson wrote:
Nick wrote:

I don't understand why you think the "implicit" (as in, not directly user
controlled?) linkage is wrong.


Twice now I've given the following specific example. I am not yet
confident that I have it right, and welcome feedback.

Sorry, I skimmed over that.


However, Suresh has apparently agreed with my conclusion that one
can use the current linkage between cpu_exclusive cpusets and sched
domains to get unexpected and perhaps undesirable sched domain setups.

What's your take on this example:


Example:

As best as I can tell (which is not very far ;), if some hapless
user does the following:

/dev/cpuset cpu_exclusive == 1; cpus == 0-7
/dev/cpuset/a cpu_exclusive == 1; cpus == 0-3
/dev/cpuset/b cpu_exclusive == 1; cpus == 4-7

and then runs a big job in the top cpuset (/dev/cpuset), then that
big job will not load balance correctly, with whatever threads
in the big job that got stuck on cpus 0-3 isolated from whatever
threads got stuck on cpus 4-7.

Is this correct?


If I have concluded incorrectly what happens in the above example
(good chance) then please educate me on how this stuff works.

That depends on what cpusets asks for. If, when setting up a and
b, it asks to partition the domains, then yes, the parent cpuset
gets broken.

I should warn you that I have demonstrated a remarkable resistance
to being educable on this subject ;).

Don't worry about the whole sched-domains implementation. Just
consider that partitioning the domains creates a hard partition
among the system's CPUs, so nothing is balanced across the boundary
(but the upshot is that within the partitions, balancing works
pretty nicely).
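
As a toy illustration of that hard-partition behaviour (Python, not
kernel code), the sketch below evens out per-CPU task counts inside
each partition only. The function name balance_within_partitions and
the "task count per CPU" model are assumptions made up for this
example; the real scheduler balances very differently in detail.

```python
def balance_within_partitions(load, partitions):
    """Even out per-CPU task counts, but only inside each partition.

    load: dict mapping cpu id -> number of runnable tasks (toy model).
    partitions: list of disjoint sets of cpu ids, a hypothetical
    stand-in for the sched-domain partitions.
    """
    balanced = dict(load)
    for part in partitions:
        cpus = sorted(part)
        total = sum(load[c] for c in cpus)
        base, extra = divmod(total, len(cpus))
        # Spread this partition's tasks as evenly as possible over its
        # own CPUs; nothing ever crosses a partition boundary.
        for i, cpu in enumerate(cpus):
            balanced[cpu] = base + (1 if i < extra else 0)
    return balanced


# Eight threads of a big job that all happened to start on CPUs 0-3.
load = {0: 2, 1: 2, 2: 2, 3: 2, 4: 0, 5: 0, 6: 0, 7: 0}

# One partition over 0-7: the threads can spread out over all CPUs.
one_partition = balance_within_partitions(load, [set(range(8))])

# Two partitions, 0-3 and 4-7: the threads stay stuck on CPUs 0-3.
two_partitions = balance_within_partitions(
    load, [set(range(0, 4)), set(range(4, 8))]
)
```

With a single partition every CPU ends up with one thread; with the
two partitions, CPUs 4-7 stay idle, which is the situation in the
example quoted above.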

So in your above example, cpusets should only ask for a single
partition covering CPUs 0-7.

If you wanted to get fancy and detect that there are no jobs in the
root cpuset, then you could make the two smaller partitions, and revert
back to the one bigger one if something gets assigned to it.
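
A minimal sketch of that policy, in Python rather than kernel code.
The name choose_partitions and its arguments are hypothetical, and
putting any CPUs not covered by a child into their own partition is
an assumption of this sketch, not something stated above.

```python
def choose_partitions(root_cpus, children, root_has_tasks):
    """Pick the partitions cpusets would ask the scheduler for.

    root_cpus: set of cpu ids in the root cpuset.
    children: list of disjoint sets of cpu ids, one per cpu_exclusive
    child cpuset.
    root_has_tasks: True if any job is attached directly to the root.
    """
    if root_has_tasks or not children:
        # A job in the root cpuset must be balanced over all its CPUs,
        # so fall back to the one big partition.
        return [set(root_cpus)]
    parts = [set(c) for c in children]
    covered = set().union(*parts)
    leftover = set(root_cpus) - covered
    if leftover:
        # Assumption: CPUs outside every child form one more partition.
        parts.append(leftover)
    return parts


root = set(range(8))
kids = [set(range(0, 4)), set(range(4, 8))]

idle_root = choose_partitions(root, kids, root_has_tasks=False)
busy_root = choose_partitions(root, kids, root_has_tasks=True)
```

Here idle_root is the two smaller partitions and busy_root is the
single 0-7 partition, i.e. the revert described above.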

But that's all a matter of how you want cpusets to manage it. I really
don't think a user should control this (we simply shouldn't allow
situations where we put a partition in the middle of a cpuset).
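
One way such a rule could be checked, sketched in Python. The helper
split_cpusets is hypothetical; it assumes the caller passes in only
the cpusets that matter (for instance, those with tasks attached) and
reports any whose CPUs span more than one partition.

```python
def split_cpusets(cpusets, partitions):
    """Return the names of cpusets whose CPUs span several partitions.

    cpusets: dict mapping cpuset path -> set of cpu ids.
    partitions: list of disjoint sets of cpu ids.
    """
    broken = []
    for name, cpus in cpusets.items():
        # Count the partitions that contain at least one of this
        # cpuset's CPUs; more than one means a boundary cuts through it.
        touched = [p for p in partitions if p & cpus]
        if len(touched) > 1:
            broken.append(name)
    return broken


cpusets = {
    "/dev/cpuset": set(range(8)),
    "/dev/cpuset/a": set(range(0, 4)),
    "/dev/cpuset/b": set(range(4, 8)),
}

cut_in_two = split_cpusets(cpusets, [set(range(0, 4)), set(range(4, 8))])
whole = split_cpusets(cpusets, [set(range(8))])
```

With the two partitions only the top cpuset is reported as split;
with the single 0-7 partition nothing is.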

Thanks,
Nick

--
SUSE Labs, Novell Inc.