Re: New NUMA scheduler and hotplug CPU

From: Martin J. Bligh
Date: Mon Jan 26 2004 - 19:11:37 EST


> Call me crazy, but why not let the topology be determined via userspace at a
> more appropriate time? When you hotplug, you tell it where in the scheduler
> to plug it. Have structures in the scheduler which represent the
> nodes-runqueues-cpus topology (in the past I tried a node/rq/cpu structs with
> simple pointers), but let the topology be built based on user's desires thru
> hotplug.

Well, I agree with the "at a more appropriate time" bit. But there's no
real need to make a bunch of complicated stuff out in userspace for this -
we're trying to lay out the scheduler domains according to the hardware
topology of the machine. It's not a userspace namespace or anything.
Having userspace fishing down way deep in hardware specific stuff is
silly - the kernel is there as a hardware abstraction layer.

Now if you wanted to use sched domains for workload management or something
and involve userspace, then yes ... that'd be more appropriate.

> For example, you boot on just the boot cpu, which by default is in the first
> node on the first runqueue. All other cpus, whether being "booted" for the
> for the first time or hotplugged (maybe now there's really no difference),
> the hotplugging tells where the cpu should be, in what node and what
> runqueue. HT cpus work even better, because you can hotplug siblings, once
> at a time if you wanted, to the same runqueue. Or you have cpus sharing a
> die, same thing, lots of choices here. This removes any per-arch updates to
> the kernel for things like scheduler topology, and lets them go somewhere
> else more easily changes, like userspace.

Ummm ... but *none* of that is dictated as policy stuff - it's all just
the hardware layout of the machine. You cannot "decide" as the sysadmin
which node a CPU is in, or which HT sibling it has. It's just there ;-)
The only thing you could possibly dictate is the CPU number you want
assigned to the new CPU, which frankly, I think is pointless - they're
arbitrary tags, and always have been.

M.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/