Re: Industry db benchmark result on recent 2.6 kernels

From: Ingo Molnar
Date: Fri Apr 01 2005 - 02:09:21 EST



* Paul Jackson <pj@xxxxxxxxxxxx> wrote:

> Nick wrote:
> > Ingo had a cool patch to estimate dirty => dirty cacheline transfer latency
> > ... Unfortunately ... and it is an O(cpus^2) operation.
>
> Yes - a cool patch.
>
> If we had an arch-specific bit of code, that for any two cpus, could
> give a 'pseudo-distance' between them, where the only real
> requirements were that (1) if two pairs of cpus had the same
> pseudo-distance, then that meant they had the same size, layout, kind
> and speed of bus amd cache hardware between them (*), and (2) it was
> cheap - hardly more than a few lines of code and a subroutine call to
> obtain, then Ingo's code could be:

yeah. The search can be limited quite drastically if all duplicate
constellations of CPUs (which is a function of the topology) are only
measured once.

but can be 'pseudo-distance' be calculated accurately enough? If it's a
scalar, how do you make sure that unique paths for data to flow have
different distances? The danger is 'false sharing' in the following
scenario: lets say CPUs #1 and #2 are connected via hardware H1,H2,H3,
CPUs #3 and #4 are connected via H4,H5,H6. Each hardware component is
unique and has different characteristics. (e.g. this scenario can happen
when different speed CPUs are mixed into the same system - or if there
is some bus assymetry)

It has to be made sure that H1+H2+H3 != H4+H5+H6, otherwise false
sharing will happen. For that 'uniqueness of sum' to be guaranteed, one
has to assign power-of-two values to each separate type of hardware
component.

[ or one has to assing very accurate 'distance' values to hardware
components. (adding another source for errors - i.e. false sharing of
the migration value) ]

and even the power-of-two assignment method has its limitations: it
obviously runs out at 32/64 components (i'm not sure we can do that),
and if a given component type can be present in the same path _twice_,
that component will have to take two bits.

or is the 'at most 64 different hardware component types' limit ok? (it
feels like a limit we might regret later.)

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/