Re: Poor PostgreSQL scaling on Linux 2.6.25-rc5 (vs 2.6.22)

From: Nick Piggin
Date: Tue Mar 11 2008 - 21:22:21 EST


(Back onto lkml)

On Tuesday 11 March 2008 23:02, Ingo Molnar wrote:
> another thing to try would be to increase:
>
> /proc/sys/kernel/sched_migration_cost
>
> from its 500 usecs default to a few msecs ?

This doesn't really help either (at 10ms).

(For the record, I've tried turning SD_WAKE_IDLE, SD_WAKE_AFFINE
on and off for each domain and that hasn't helped either).

I've also tried increasing sched_latency_ns as far as it can go.
BTW, this is pretty nasty behaviour, in my opinion. The scheduler
starts *increasing* the number of involuntary context switches
as resources get oversubscribed. That's completely unintuitive as
far as I can see -- when we get overloaded, the obvious thing to
do is try to increase efficiency, or at least try as hard as
possible not to lose it. So context switches should be steady or
decreasing as I add more processes to a runqueue.

It seems to max out at nearly 100 context switches per second,
and this has actually been shown to be too frequent for modern
CPUs and big caches.

Increasing the tunable didn't help for this workload, but the
scheduler really needs to be fixed so that it doesn't decrease
timeslices as the number of processes increases.
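
To illustrate the shrinking timeslice, here is a rough model of how the
scheduling period is divided among runnable tasks. It is only a sketch:
the 20ms latency target and 4ms minimum granularity are assumed values,
not numbers measured in this thread (the kernel also scales them with
the CPU count), and the function names are made up for the example. It
assumes equal-weight, CPU-bound tasks on a single runqueue and ignores
wakeup preemption entirely.

```python
# Simplified model: per-task timeslice versus runqueue length.
# Both constants are assumptions made for illustration only.
SCHED_LATENCY_NS = 20_000_000          # assumed latency target (20 ms)
SCHED_MIN_GRANULARITY_NS = 4_000_000   # assumed minimum slice (4 ms)


def timeslice_ns(nr_running,
                 latency_ns=SCHED_LATENCY_NS,
                 min_gran_ns=SCHED_MIN_GRANULARITY_NS):
    """Slice each of nr_running equal-weight tasks gets per period."""
    if nr_running < 1:
        raise ValueError("nr_running must be at least 1")
    # Number of tasks that fit in one latency period at minimum granularity.
    nr_latency = latency_ns // min_gran_ns
    if nr_running > nr_latency:
        # Too many tasks: stretch the period so no slice drops
        # below the minimum granularity.
        period_ns = nr_running * min_gran_ns
    else:
        period_ns = latency_ns
    return period_ns // nr_running


def switches_per_second(nr_running, **kwargs):
    """Timeslice-expiry switches per second on one CPU (model only)."""
    if nr_running == 1:
        return 0.0  # a lone task has nothing to switch to
    return 1e9 / timeslice_ns(nr_running, **kwargs)


if __name__ == "__main__":
    for n in (1, 2, 3, 5, 10, 20):
        print(n, timeslice_ns(n), switches_per_second(n))
```

In this model the slice shrinks from 20ms to 4ms as the runqueue grows
from one to five tasks, so the switch rate rises until it hits the
minimum-granularity floor and then stays flat. Raising the latency
target only moves where the floor kicks in; the slice still shrinks as
tasks are added.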
