Re: [patch] Re: PostgreSQL pgbench performance regression in 2.6.23+

From: Greg Smith
Date: Fri Jun 06 2008 - 01:04:28 EST


On Tue, 27 May 2008, Mike Galbraith wrote:

> Care to give the below a whirl? It fixes the over-enthusiastic affinity
> bug in a less restrictive way. It doesn't attempt to address the needs
> of any particular load though, that needs more thought (tricky issue).
>
> With default features, I get the below...

Sorry I didn't get back to you until now; I got distracted for a bit. Here's my table, now updated with this patched version and with your numbers for comparison, since we have the same basic processor setup:

Clients  2.6.22.19  2.6.26.git  patch   Mike
      1       7660       11043  11003  10122
      2      17798       11452  16868  14360
      3      29612       13231  20381  17049
      4      25584       13053  22222  18749
      6      25295       12263  23546  24913
      8      24344       11748  23895  27976
     10      23963       11612  22492  29347
     15      23026       11414  21896  29157
     20      22549       11332  21015  28392
     30      22074       10743  18411  26590
     40      21495       10406  17982  24422
     50      20051       10534  17009  23306

So this is a huge win for this patch compared with the stock 2.6.26.git (I'm still using the daily snapshot from 2008-05-26) and a nice improvement over the earlier, smaller patches I tested in this thread (which peaked at 19537 for 10 clients for me with default features, vs. a peak of 23895 @ 8 here).
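For anyone who wants to see the size of the regression as a percentage rather than eyeball it, here is a minimal sketch that expresses the stock and patched results as a fraction of the 2.6.22.19 numbers. The figures are copied from my table above; the names `results` and `pct_of_baseline` are just ones I made up for this illustration.

```python
# pgbench results from the table above:
# clients -> (2.6.22.19, stock 2.6.26.git, patched 2.6.26.git)
results = {
    1: (7660, 11043, 11003),
    2: (17798, 11452, 16868),
    3: (29612, 13231, 20381),
    4: (25584, 13053, 22222),
    6: (25295, 12263, 23546),
    8: (24344, 11748, 23895),
    10: (23963, 11612, 22492),
    15: (23026, 11414, 21896),
    20: (22549, 11332, 21015),
    30: (22074, 10743, 18411),
    40: (21495, 10406, 17982),
    50: (20051, 10534, 17009),
}


def pct_of_baseline(value, baseline):
    """Return value as a percentage of the baseline result."""
    return 100.0 * value / baseline


print("clients  stock%  patched%")
for clients, (base, stock, patched) in results.items():
    print(f"{clients:7d}  {pct_of_baseline(stock, base):6.1f}  "
          f"{pct_of_baseline(patched, base):8.1f}")

# Client count at which the patched kernel reaches its best result.
peak_clients = max(results, key=lambda c: results[c][2])
print("patched peak:", results[peak_clients][2], "@", peak_clients, "clients")
```

This only rearranges numbers already in the table, so it says nothing new about the scheduler; it just makes the per-row comparison easier to read.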

I think I might not be testing exactly the same thing you did, though, because the pattern doesn't match. I think that my Q6600 system runs a little bit faster than yours, which is the case for small numbers of clients here. But once we get above 8 clients your setup is way faster, with the difference at 30 clients being the largest. Were you perhaps using batch mode when you generated these results? That's the only thing I could think of that would produce this pattern. If it's not something simple like that, I may have to dig into whether there was some change in the git snapshot between what you tested and what I did.
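To make the mismatch between the two patched columns concrete, here is a rough sketch that lists the gap per client count and picks out where it is widest. The numbers are again taken straight from the table; `patched` and `gaps` are names invented for this example.

```python
# Patched-kernel results from the table above:
# clients -> (my result, Mike's result)
patched = {
    1: (11003, 10122),
    2: (16868, 14360),
    3: (20381, 17049),
    4: (22222, 18749),
    6: (23546, 24913),
    8: (23895, 27976),
    10: (22492, 29347),
    15: (21896, 29157),
    20: (21015, 28392),
    30: (18411, 26590),
    40: (17982, 24422),
    50: (17009, 23306),
}

# Positive gap means Mike's system was faster at that client count.
gaps = {clients: mike - mine for clients, (mine, mike) in patched.items()}

for clients, gap in gaps.items():
    print(f"{clients:3d} clients: {gap:+6d}")

# Lowest client count at which Mike's numbers overtake mine.
crossover = min(c for c, gap in gaps.items() if gap > 0)
# Client count with the widest gap in Mike's favor.
widest = max(gaps, key=gaps.get)
print("crossover at", crossover, "clients; widest gap at", widest, "clients")
```

It is only a subtraction over two columns, but it shows at a glance where the two setups trade places.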

Regardless, clearly your patch reduces the regression with the default parameters to a mild one instead of the gigantic one we started with. Considering how generally incompatible this benchmark is with this scheduler, and that there are clear workarounds (feature disabling) I can document in PostgreSQL land to "fix" the problem defined for me now, I'd be happy if all that came from this investigation was this change. I'd hope that being strengthened against this workload improves the scheduler's robustness for other tasks of this type, which I'm sure there are more of than just pgbench.

You get my vote for moving toward committing it+backport even if the improvement is only what I saw in my tests. If I can figure out how to get closer to the results you got, all the better.

--
* Greg Smith gsmith@xxxxxxxxxxxxx http://www.gregsmith.com Baltimore, MD