Re: [patch 1/2] x86, x2apic: minimize IPI register writes using cluster groups v4

From: Suresh Siddha
Date: Mon May 02 2011 - 14:26:46 EST


On Mon, 2011-05-02 at 07:02 -0700, Cyrill Gorcunov wrote:
> On 05/02/2011 05:22 PM, Ingo Molnar wrote:
> >
> > * Cyrill Gorcunov <gorcunov@xxxxxxxxxx> wrote:
> >
> >> With this change, microbenchmark measuring the cost
> >> of flush_tlb_others(), with the flush tlb IPI being
> >> sent from a cpu in the socket-1 to all the logical
> >> cpus in socket-2 (on a Westmere-EX system that has
> >> 20 logical cpus in a socket) is 3x times better now
> >> (compared to the former 'send one-by-one' algorithm).
> >
> > What kind of microbenchmark was this, could the actual results and measurement
> > methods be shared as well?
>
> Suresh, could you please post the microbenchmark?

It is a simple kernel hack that measures the TSC cost of flush_tlb_others()
with and without this change. The 3x improvement was specifically for the
test condition where we called flush_tlb_others() on a logical cpu in
socket-1, which sent the flush tlb IPI to all the logical cpus in
another socket.

This was done on WSM-EX, which has 20 logical cpus in one socket. The 20
logical cpus in that socket fall under two cluster groups. So it is 2
batches of grouped IPIs vs 20 serialized (at least on the sending side)
IPIs.
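
To illustrate the 2-vs-20 arithmetic, here is a minimal userspace sketch,
not the patch itself. It relies on the x2APIC cluster-mode rule that the
logical ID is (cluster << 16) | (1 << (id & 0xf)) with cluster = id >> 4,
so at most 16 cpus share a cluster. The helper names are hypothetical, and
the APIC ID range 32..51 mentioned afterwards is an assumed numbering for
illustration, not the real WSM-EX enumeration.

```c
#include <stddef.h>
#include <stdint.h>

/* Cluster ID of a cpu in x2APIC cluster mode: upper bits of the x2APIC ID. */
static uint32_t x2apic_cluster(uint32_t x2apic_id)
{
	return x2apic_id >> 4;
}

/* Logical ID: cluster ID in bits 31:16, one-hot cpu bit in bits 15:0. */
static uint32_t x2apic_logical_id(uint32_t x2apic_id)
{
	return (x2apic_cluster(x2apic_id) << 16) | (1u << (x2apic_id & 0xf));
}

/*
 * Hypothetical helper: number of ICR writes needed when the targets are
 * grouped by cluster, i.e. the number of distinct clusters in ids[].
 * The one-by-one scheme would need n writes instead.
 */
static size_t count_cluster_ipis(const uint32_t *ids, size_t n)
{
	size_t writes = 0;

	for (size_t i = 0; i < n; i++) {
		int seen = 0;

		/* Has an earlier target already covered this cluster? */
		for (size_t j = 0; j < i; j++) {
			if (x2apic_cluster(ids[j]) == x2apic_cluster(ids[i])) {
				seen = 1;
				break;
			}
		}
		if (!seen)
			writes++;
	}
	return writes;
}

/*
 * Hypothetical helper: combined destination for one cluster, formed by
 * OR-ing the logical IDs of every target that belongs to that cluster.
 */
static uint32_t cluster_dest(const uint32_t *ids, size_t n, uint32_t cluster)
{
	uint32_t dest = 0;

	for (size_t i = 0; i < n; i++) {
		if (x2apic_cluster(ids[i]) == cluster)
			dest |= x2apic_logical_id(ids[i]);
	}
	return dest;
}
```

With the assumed IDs 32..51, count_cluster_ipis() returns 2: cluster 2
holds 16 of the cpus and cluster 3 holds the remaining 4, so the sender
does two ICR writes rather than twenty.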

thanks,
suresh


