Re: [PATCH RFC 1/2] qrwlock: A queue read/write lock implementation

From: Waiman Long
Date: Tue Jul 23 2013 - 20:04:05 EST

Next message: Greg Kroah-Hartman: "Re: [PATCH v2 0/9] mostly lockless tty echo"
Previous message: Chen Gang: "Re: [PATCH, re-send] Always trap on BUG()"
In reply to: Ingo Molnar: "Re: [PATCH RFC 1/2] qrwlock: A queue read/write lock implementation"
Next in thread: George Spelvin: "Re: [PATCH RFC 1/2] qrwlock: A queue read/write lock implementation"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 07/22/2013 06:34 AM, Ingo Molnar wrote:

* Waiman Long<waiman.long@xxxxxx> wrote:

I had run some performance tests using the fserver and new_fserver
benchmarks (on ext4 filesystems) of the AIM7 test suite on a 80-core
DL980 with HT on. The following kernels were used:

1. Modified 3.10.1 kernel with mb_cache_spinlock in fs/mbcache.c
replaced by a rwlock
2. Modified 3.10.1 kernel + modified __read_lock_failed code as suggested
by Ingo
3. Modified 3.10.1 kernel + queue read/write lock
4. Modified 3.10.1 kernel + queue read/write lock in classic read/write
lock behavior

The last one is with the read lock stealing flag set in the qrwlock
structure to give priority to readers and behave more like the classic
read/write lock with less fairness.

The following table shows the averaged results in the 200-1000
user ranges:

+-----------------+--------+--------+--------+--------+
| Kernel | 1 | 2 | 3 | 4 |
+-----------------+--------+--------+--------+--------+
| fserver JPM | 245598 | 274457 | 403348 | 411941 |
| % change from 1 | 0% | +11.8% | +64.2% | +67.7% |
+-----------------+--------+--------+--------+--------+
| new-fserver JPM | 231549 | 269807 | 399093 | 399418 |
| % change from 1 | 0% | +16.5% | +72.4% | +72.5% |
+-----------------+--------+--------+--------+--------+

So it's not just herding that is a problem.

I'm wondering, how sensitive is this particular benchmark to fairness?
I.e. do the 200-1000 simulated users each perform the same number of ops,
so that any smearing of execution time via unfairness gets amplified?

I.e. does steady-state throughput go up by 60%+ too with your changes?

For this particular benchmark, there are interplay of different locks that determine the overall performance of the system. Yes, I got steady state performance gain of 60%+ with the qrwlock change with the modified mbcache.c. Without the modified mbcache.c file, the performance gain drop to 20-30%. I am still trying to find out more about the performance variations in different situations.

Regards,
Longman
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Next message: Greg Kroah-Hartman: "Re: [PATCH v2 0/9] mostly lockless tty echo"
Previous message: Chen Gang: "Re: [PATCH, re-send] Always trap on BUG()"
In reply to: Ingo Molnar: "Re: [PATCH RFC 1/2] qrwlock: A queue read/write lock implementation"
Next in thread: George Spelvin: "Re: [PATCH RFC 1/2] qrwlock: A queue read/write lock implementation"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]