Re: ipc,sem: sysv semaphore scalability

From: Davidlohr Bueso
Date: Thu Mar 21 2013 - 21:12:59 EST


On Wed, 2013-03-20 at 15:55 -0400, Rik van Riel wrote:
> Include lkml in the CC: this time... *sigh*
> ---8<---
>
> This series makes the sysv semaphore code more scalable,
> by reducing the time the semaphore lock is held, and making
> the locking more scalable for semaphore arrays with multiple
> semaphores.
>
> The first four patches were written by Davidlohr Bueso, and
> reduce the hold time of the semaphore lock.
>
> The last three patches change the sysv semaphore code locking
> to be more fine grained, providing a performance boost when
> multiple semaphores in a semaphore array are being manipulated
> simultaneously.
>
> On a 24 CPU system, performance numbers with the semop-multi
> test with N threads and N semaphores look like this:
>
>           vanilla   Davidlohr's   Davidlohr's +    Davidlohr's +
>  threads            patches       rwlock patches   v3 patches
>      10   610652    726325        1783589          2142206
>      20   341570    365699        1520453          1977878
>      30   288102    307037        1498167          2037995
>      40   290714    305955        1612665          2256484
>      50   288620    312890        1733453          2650292
>      60   289987    306043        1649360          2388008
>      70   291298    306347        1723167          2717486
>      80   290948    305662        1729545          2763582
>      90   290996    306680        1736021          2757524
>     100   292243    306700        1773700          3059159
>
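
For reference, the semop-multi test above is essentially N threads each
hammering semop() on its own semaphore within one shared N-semaphore sysv
array, so the array-wide ipc lock is the bottleneck. A minimal sketch of
that kind of workload (my reconstruction for illustration, not the actual
benchmark source; NTHREADS/ITERS are arbitrary) looks like this:

/* Build with: gcc -O2 -pthread semop-sketch.c */
#include <stdio.h>
#include <pthread.h>
#include <sys/ipc.h>
#include <sys/sem.h>

#define NTHREADS 10
#define NSEMS    NTHREADS
#define ITERS    100000

static int semid;

static void *worker(void *arg)
{
	unsigned short num = (unsigned short)(long)arg;
	struct sembuf up   = { .sem_num = num, .sem_op =  1, .sem_flg = 0 };
	struct sembuf down = { .sem_num = num, .sem_op = -1, .sem_flg = 0 };
	int i;

	for (i = 0; i < ITERS; i++) {
		semop(semid, &up, 1);	/* V: increment semaphore 'num' */
		semop(semid, &down, 1);	/* P: decrement it again */
	}
	return NULL;
}

int main(void)
{
	pthread_t threads[NTHREADS];
	long i;

	/* One array with NSEMS semaphores: every thread hits the same
	 * ipc object, but each one operates on a different semaphore,
	 * which is the case the fine-grained locking patches target. */
	semid = semget(IPC_PRIVATE, NSEMS, IPC_CREAT | 0600);
	if (semid < 0) {
		perror("semget");
		return 1;
	}

	for (i = 0; i < NTHREADS; i++)
		pthread_create(&threads[i], NULL, worker, (void *)i);
	for (i = 0; i < NTHREADS; i++)
		pthread_join(threads[i], NULL);

	semctl(semid, 0, IPC_RMID);	/* remove the semaphore array */
	return 0;
}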

After testing these patches with my Oracle Swingbench DSS workload, I
can say that the improvements are significant. The ipc lock contention
was reduced drastically, especially with larger numbers of benchmark
users. As a result, the overall %sys time went down as well, and
throughput (in transactions per second) increased.

TPS:
100 users: 1257.21 (vanilla) 2805.06 (v3 patchset)
400 users: 1437.57 (vanilla) 2664.67 (v3 patchset)
800 users: 1236.89 (vanilla) 2750.73 (v3 patchset)

ipc lock contention:
100 users: 8.74% (vanilla) 3.17% (v3 patchset)
400 users: 21.86% (vanilla) 5.23% (v3 patchset)
800 users: 84.35% (vanilla) 7.39% (v3 patchset)

As seen with perf, the ipc lock is no longer even the main source of
contention. Furthermore, regardless of the number of benchmark users,
the lock is taken mostly from semctl_main().
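
To be clear about which userspace paths those two kernel symbols
correspond to: semctl_main() handles semctl() commands such as
GETALL/SETALL/SETVAL, while sys_semtimedop() backs semop() and
semtimedop(). Roughly (an illustrative snippet, not taken from the
benchmark):

#include <sys/sem.h>

/* On Linux, the caller must define union semun itself. */
union semun {
	int val;
	struct semid_ds *buf;
	unsigned short *array;
};

void touch_sem_paths(int semid, int nsems)
{
	unsigned short vals[nsems];
	union semun arg = { .array = vals };
	struct sembuf sop = { .sem_num = 0, .sem_op = 1, .sem_flg = 0 };

	semctl(semid, 0, GETALL, arg);	/* -> semctl_main() */
	semop(semid, &sop, 1);		/* -> sys_semtimedop() */
}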

100 users:
     3.17%  oracle  [kernel.kallsyms]  [k] _raw_spin_lock
            |
            --- _raw_spin_lock
               |
               |--50.53%-- sem_lock
               |          |
               |          |--82.60%-- semctl_main
               |           --17.40%-- sys_semtimedop

400 users:
     5.23%  oracle  [kernel.kallsyms]  [k] _raw_spin_lock
            |
            --- _raw_spin_lock
               |
               |--75.81%-- sem_lock
               |          |
               |          |--94.09%-- semctl_main
               |           --5.91%-- sys_semtimedop

800 users:
     7.39%  oracle  [kernel.kallsyms]  [k] _raw_spin_lock
            |
            --- _raw_spin_lock
               |
               |--81.71%-- sem_lock
               |          |
               |          |--64.98%-- semctl_main
               |           --35.02%-- sys_semtimedop
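
For anyone wanting to reproduce these call-graph profiles, an invocation
along the following lines works (my usual one, details may differ from
what was used here):

perf record -a -g -- sleep 60
perf report --stdio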


Thanks,
Davidlohr

