Re: [PATCH 15/15] x86: Fix cpu_coregroup_mask to return correct cpumask on multi-node processors

From: Ingo Molnar
Date: Tue Aug 25 2009 - 06:37:03 EST



* Andreas Herrmann <andreas.herrmann3@xxxxxxx> wrote:

> On Mon, Aug 24, 2009 at 08:21:54PM +0200, Ingo Molnar wrote:
> >
> > * Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >
> > > On Thu, 2009-08-20 at 15:46 +0200, Andreas Herrmann wrote:
> > > > The correct mask that describes core-siblings of a processor
> > > > is topology_core_cpumask. See topology adaptation patches, especially
> > > > http://marc.info/?l=linux-kernel&m=124964999608179
> > >
> > > argh, violence, murder kill.. this is the worst possible hack and
> > > you're extending it :/
> >
> > I think most of the trouble here comes from having inconsistent
> > names, a rather static structure for sched-domains setup and
> > then we are confusing things back and forth.
> >
> > Right now we have thread/sibling, core, CPU/socket and node,
> > with many data structures around these hardcoded. Certain
> > scheduler features only operate on the hardcoded fields.
> >
> > Now Magny-Cours adds a socket internal node construct to the
> > whole thing, names it randomly and basically breaks the
> > semi-static representation.
> >
> > We cannot just flip around our static names and hope it goes
> > well and everything just drops into place. Everything just falls
> > apart really instead.
> >
> > Instead we should have an arch-defined tree and a CPU
> > architecture dependent ASCII name associated with each level -
> > but not hardcoded into the scheduler.
>
> I admit that it's strange to have the x86 specific SCHED_SMT/MC
> snippets in common code.
>
> And the NUMA/SD_NODE stuff is not used by all architectures
> either.
>
> Having an arch-defined tree seems the right thing to do.

yep, with generic helpers to reduce per arch bloat.
(named/structured in a neutral way)

> > Plus we should have independent scheduler domains feature flags
> > that can be turned on/off in various levels of that tree,
> > depending on the cache and interconnect properties of the
> > hardware - without having to worry about what the ASCII name
> > says. Those features should be capable of working not just on the
> > lowest level of the tree, but on higher levels too, regardless of
> > whether that level is called a 'core', a 'socket' or an
> > 'internal node' on the ASCII level really.
> >
> > This is why i insisted on handling the Magny-Cours topology
> > discovery and enumeration patches together with the scheduler
> > patches. It can easily become a mess if extended.
>
> I don't buy this argument.
>
> The main source of information when building sched-domains will be
> the CPU topology. That must be provided somehow independent of how
> scheduling domains are created. When the domains are built you
> just need to know which cpumask to use when the sched_groups and
> domain's span are determined.
>
> Thus I think the topology detection is rather self-contained and
> can/should be provided independent of how the scheduler side is
> going to be implemented.

This is the sysfs bits? What is this needed for exactly? The
scheduler is pretty much the most important thing to tune in a
topology aware manner, besides memory allocations.

Ingo
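
To make the "arch-defined tree" idea discussed above a bit more concrete,
here is a minimal user-space sketch in C. It is not kernel code and not
anyone's proposed patch. Everything in it is an assumption made up for
illustration: the struct topo_level type, the TOPO_FLAG_* names, the
topo_level_mask() and topo_find_level() helpers, the 32-bit toy cpumask,
and the example layout (two 12-core packages with two 6-core internal
nodes each, CPUs numbered contiguously). The point it tries to show is
that the generic helpers look only at spans and feature flags, while the
ASCII name is carried along purely as a label.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS 24u              /* assumption: 2 packages x 12 cores */

typedef uint32_t cpumask_t;      /* toy stand-in: one bit per CPU */

/* Hypothetical per-level feature flags, independent of any level name. */
#define TOPO_FLAG_SHARES_CACHE  (1u << 0)
#define TOPO_FLAG_POWERSAVE     (1u << 1)

/* One level of the arch-defined tree, ordered from lowest to highest. */
struct topo_level {
    const char  *name;           /* arch-chosen ASCII label, informational only */
    unsigned int cpus_per_group; /* CPUs spanned by one group at this level */
    unsigned int flags;          /* features enabled at this level */
};

/* Example tree an architecture might register (illustrative values only). */
const struct topo_level example_tree[] = {
    { "internal-node",  6, TOPO_FLAG_SHARES_CACHE },
    { "socket",        12, TOPO_FLAG_POWERSAVE    },
    { "system",        24, 0                      },
};

/* Generic helper: mask of the CPUs that share a group with 'cpu' at 'lvl'.
 * Assumes groups are contiguous, aligned ranges of CPU numbers. */
cpumask_t topo_level_mask(const struct topo_level *lvl, unsigned int cpu)
{
    assert(cpu < NR_CPUS);

    unsigned int first = cpu - cpu % lvl->cpus_per_group;
    cpumask_t mask = 0;

    for (unsigned int i = 0; i < lvl->cpus_per_group && first + i < NR_CPUS; i++)
        mask |= (cpumask_t)1 << (first + i);
    return mask;
}

/* Generic helper: lowest level that has 'flag' set, or NULL if none does.
 * Note that it never inspects the level's name. */
const struct topo_level *topo_find_level(const struct topo_level *tree,
                                         size_t nr_levels, unsigned int flag)
{
    for (size_t i = 0; i < nr_levels; i++)
        if (tree[i].flags & flag)
            return &tree[i];
    return NULL;
}
```

With this layout, a caller that wants "the span over which a cache is
shared" asks for TOPO_FLAG_SHARES_CACHE and gets whatever level the
architecture flagged, whether that level happens to be labelled a core, a
socket or an internal node.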