Re: 20% performance drop on PostgreSQL 9.2 from kernel 3.5.3 to 3.6-rc5 on AMD chipsets - bisected

From: Mel Gorman
Date: Tue Sep 25 2012 - 09:23:10 EST


On Mon, Sep 24, 2012 at 07:44:17PM +0200, Peter Zijlstra wrote:
> On Mon, 2012-09-24 at 18:54 +0200, Peter Zijlstra wrote:
> > But let me try and come up with the list thing, I think we've
> > actually got that someplace as well.
>
> OK, I'm sure the below can be written better, but my brain is gone for
> the day...
>

It crashes on boot due to the fact that you created a function-scope variable
called sd_llc in select_idle_sibling() and shadowed the actual sd_llc you
were interested in. Result: dereferenced uninitialised pointer and kaboom.
Trivial to fix so it boots at least.
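
For anyone following along, the shape of the bug is roughly the following. This is only a minimal userspace sketch, not the scheduler code: struct domain, llc_domain, pick_shadowed() and pick_fixed() are made-up names for illustration, and the local pointer is set to NULL and checked so the example stays well defined, whereas in the patch it was uninitialised and dereferenced.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the real domain structure. */
struct domain {
	int id;
};

/* The outer variable the function is supposed to be reading. */
static struct domain llc_domain = { 42 };
static struct domain *sd_llc = &llc_domain;

/*
 * Buggy shape: the function-scope declaration hides the outer sd_llc,
 * so every use below refers to the local pointer instead.
 */
int pick_shadowed(void)
{
	struct domain *sd_llc = NULL;	/* shadows the file-scope sd_llc */

	return sd_llc ? sd_llc->id : -1;
}

/* Fixed shape: no local declaration, so the outer sd_llc is used. */
int pick_fixed(void)
{
	assert(sd_llc != NULL);

	return sd_llc->id;
}
```

With these definitions pick_shadowed() never sees the outer pointer and returns -1, while pick_fixed() returns 42. Building with GCC's -Wshadow would flag the local declaration.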

This is a silly test for a scheduler patch but as "sched: Avoid SMT siblings
in select_idle_sibling() if possible" regressed 2% back in 3.2, it seemed
reasonable to retest with it.

KERNBENCH
                3.6.0               3.6.0               3.6.0
                rc6-vanilla         rc6-mikebuddy-v1r1  rc6-idlesibling-v1r1
User min        352.47 (   0.00%)   351.77 (   0.20%)   352.30 (   0.05%)
User mean       353.10 (   0.00%)   352.78 (   0.09%)   352.77 (   0.09%)
User stddev       0.41 (   0.00%)     0.56 ( -36.13%)     0.35 (  15.16%)
User max        353.55 (   0.00%)   353.43 (   0.03%)   353.31 (   0.07%)
System min       34.86 (   0.00%)    34.83 (   0.09%)    35.37 (  -1.46%)
System mean      35.35 (   0.00%)    35.29 (   0.16%)    35.63 (  -0.80%)
System stddev     0.41 (   0.00%)     0.40 (   0.10%)     0.15 (  62.26%)
System max       35.94 (   0.00%)    36.05 (  -0.31%)    35.81 (   0.36%)
Elapsed min     110.18 (   0.00%)   109.65 (   0.48%)   110.04 (   0.13%)
Elapsed mean    110.21 (   0.00%)   109.75 (   0.42%)   110.15 (   0.06%)
Elapsed stddev    0.03 (   0.00%)     0.07 (-167.83%)     0.09 (-207.56%)
Elapsed max     110.26 (   0.00%)   109.86 (   0.36%)   110.26 (   0.00%)
CPU min         352.00 (   0.00%)   353.00 (  -0.28%)   352.00 (   0.00%)
CPU mean        352.00 (   0.00%)   353.00 (  -0.28%)   352.00 (   0.00%)
CPU stddev        0.00 (   0.00%)     0.00 (   0.00%)     0.00 (   0.00%)
CPU max         352.00 (   0.00%)   353.00 (  -0.28%)   352.00 (   0.00%)

mikebuddy-v1r1 is Mike's patch that just got reverted. idlesibling is
Peter's patch. "Elapsed mean" time is the main value of interest. Mike's
patch gains 0.42% which is less than the 2% lost but at least the gain is
outside the noise. idlesibling makes very little difference. "System mean"
is also interesting because even though idlesibling shows a "regression", it
also shows that the variation between runs is reduced. That might indicate
that fewer cache misses are being incurred in the select_idle_sibling()
code although that is a bit of a leap of faith.
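
For reference, the bracketed percentages for the min/mean/max rows look like the relative change against the vanilla column, with a lower time counted as a gain. A minimal sketch of that arithmetic follows; gain_percent() is a hypothetical helper, not part of the benchmark tooling, and feeding it the rounded values printed in the table can give a result that differs from the reported figure in the last digit.

```c
#include <assert.h>

/*
 * Relative improvement of `candidate` over `baseline`, in percent.
 * Positive means the candidate took less time (a gain), negative
 * means it took more (a regression).
 */
double gain_percent(double baseline, double candidate)
{
	assert(baseline != 0.0);	/* no relative change against zero */

	return (baseline - candidate) / baseline * 100.0;
}
```

gain_percent(110.21, 109.75) comes out at roughly 0.42, matching the "Elapsed mean" gain for Mike's patch, and gain_percent(35.35, 35.63) at roughly -0.79 against the -0.80 reported for idlesibling "System mean".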

The machine is in use at the moment but I'll queue up a test this evening to
gather a profile to confirm time is even being spent in select_idle_sibling().
Just because 2% was lost in select_idle_sibling() back in 3.2 does not
mean squat now.

--
Mel Gorman
SUSE Labs