Re: [PATCH 2/3] x86: x2apic/cluster: Make use of lowest priority delivery mode

From: Suresh Siddha
Date: Mon May 21 2012 - 14:38:47 EST


On Mon, 2012-05-21 at 11:18 -0700, Linus Torvalds wrote:
> On Mon, May 21, 2012 at 11:07 AM, Suresh Siddha
> <suresh.b.siddha@xxxxxxxxx> wrote:
> >
> > All the cluster members of a given x2apic cluster belong to the same
> > package. These x2apic cluster id's are setup by the HW and not by the
> > SW. And only one cluster (with one or multiple members of that cluster
> > set) can be specified in the interrupt destination field of the routing
> > table entry.
>
> Ok, then the main question ends up being if there are enough cache or
> power domains within a cluster to still worry about it.

There are 16 members within an x2apic cluster. With two HT siblings per
core, that still leaves 8 cores per cluster.
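
To make the cluster layout concrete, here is a rough sketch of how the
logical x2APIC ID splits into a cluster ID and a 16-bit member mask,
following the derivation given in the Intel SDM (logical ID =
(x2APIC ID[19:4] << 16) | (1 << x2APIC ID[3:0])). The helper names are
hypothetical and are not taken from the patch; this only illustrates why
a single destination can name several members of one cluster but never
members of two clusters.

```c
#include <stdint.h>

/* x2APIC ID bits 19:4 select the cluster. */
static uint32_t x2apic_cluster_id(uint32_t apicid)
{
	return (apicid >> 4) & 0xffff;
}

/* x2APIC ID bits 3:0 select one of the 16 member bits within the cluster. */
static uint32_t x2apic_member_bit(uint32_t apicid)
{
	return UINT32_C(1) << (apicid & 0xf);
}

/* Logical x2APIC ID: cluster ID in bits 31:16, one-hot member bit in 15:0. */
static uint32_t x2apic_logical_id(uint32_t apicid)
{
	return (x2apic_cluster_id(apicid) << 16) | x2apic_member_bit(apicid);
}

/*
 * Hypothetical helper: build one logical destination covering all the
 * given x2APIC IDs. Returns 0 on success, -1 if the list is empty or the
 * IDs span more than one cluster (which one destination cannot express).
 */
static int x2apic_cluster_dest(const uint32_t *apicids, int n, uint32_t *dest)
{
	uint32_t cluster, mask = 0;
	int i;

	if (n <= 0)
		return -1;

	cluster = x2apic_cluster_id(apicids[0]);
	for (i = 0; i < n; i++) {
		if (x2apic_cluster_id(apicids[i]) != cluster)
			return -1;
		mask |= x2apic_member_bit(apicids[i]);
	}

	*dest = (cluster << 16) | mask;
	return 0;
}
```

For example, x2APIC IDs 0x20 and 0x23 both fall in cluster 2 and combine
into the destination 0x00020009, while 0x20 and 0x31 sit in different
clusters and cannot share one destination.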

>
> For example, you say "package", but that can sometimes mean multiple
> dies, or even just split caches that are big enough to matter
> (although I can't think of any such right now on the x86 side - Core2
> Duo had huge L2's, but they were shared, not split).

Most likely multiple dies or split caches will have different
cluster-id's. I don't know of any upcoming x2apic-capable implementations
that are built that way, but I will keep an eye out.

>
> > Power aware interrupt routing in IVB does this. And the policy of
> > whether you want the interrupt to be routed to the busy core (to save
> > power) or an idle core (for minimizing the interruptions on the busy
> > core) can be selected by the SW (using IA32_ENERGY_PERF_BIAS MSR).
>
> Sounds like we definitely would want to support this at least in the
> IVB timeframe then.
>
> But I do agree with Ingo that it would be really good to actually see
> numbers (and no, I don't mean "look here, now the irq's are nicely
> spread out", but power and/or performance numbers showing that it
> actually helps something).

I agree. This is the reason why I held off on posting these patches
earlier. I can come up with micro-benchmarks that show some difference,
but the key is to find a good workload/benchmark that shows a measurable
difference. Any suggestions?

thanks,
suresh
