Waiman,

On Mon, 15 Jul 2013, Waiman Long wrote:
> On 07/15/2013 06:31 PM, Thomas Gleixner wrote:
> > On Fri, 12 Jul 2013, Waiman Long wrote:
> > > Apparently, the regular read/write lock performs even better than
> > > the queue read/write lock in some cases. This is probably due to the
> >
> > The regular rwlock performs better in most cases. This is the full
> > list comparing both against the ticket lock.
> >
> >     qrlock     rwlock
> >     +20.7      +44.4
> >     +30.1      +42.9
> >     +56.3      +63.3
> >     +52.9      +48.8
> >     +54.4      +65.1
> >     +49.2      +26.5
> >
> > So you try to sell that qrwlock as a replacement for ticket spinlocks,
> > while at the same time you omit the fact that we have an even better
> > implementation (except for the last test case) already in the
> > kernel. What's the point of this exercise?
>
> The main point is that the regular rwlock is not fair while the
> queue rwlock is close to as fair as the ticket spinlock. The LWN
> article http://lwn.net/Articles/364583/ mentioned eliminating
> rwlock altogether precisely because of this unfairness as it can
> cause livelock in certain scenarios. I also saw slides advising
> against using rwlock because of this.

I'm well aware of this. But that does not explain anything of what I
asked.

> > > + * has the following advantages:
> > > + * 1. It is more deterministic. Even though there is a slight chance of
> >
> > Why is it more deterministic than the existing implementation?
>
> Deterministic means that a process can acquire a lock within a
> reasonable time period without being starved for a long time. The qrwlock
> grants lock in FIFO order in most cases. That is what I mean by being more
> deterministic.

That's exactly the kind of explanation we want to have in the code and
the changelog.
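
For illustration only, FIFO granting can be sketched with a minimal
user-space ticket lock built on C11 atomics. This is not the kernel's
ticket spinlock and not the qrwlock from the patch; the struct and
function names below are hypothetical. It only shows why a waiter
cannot be overtaken once it holds a ticket.

```c
#include <stdatomic.h>

/* Hypothetical user-space ticket lock, for illustration only. */
struct ticket_lock {
	atomic_uint next;	/* next ticket to hand out */
	atomic_uint owner;	/* ticket that may hold the lock now */
};

/* Take a ticket; tickets are handed out in arrival order. */
static inline unsigned ticket_take(struct ticket_lock *l)
{
	return atomic_fetch_add_explicit(&l->next, 1, memory_order_relaxed);
}

/* True once every earlier ticket holder has released the lock. */
static inline int ticket_is_my_turn(struct ticket_lock *l, unsigned t)
{
	return atomic_load_explicit(&l->owner, memory_order_acquire) == t;
}

static inline void ticket_lock(struct ticket_lock *l)
{
	unsigned t = ticket_take(l);

	while (!ticket_is_my_turn(l, t))
		;	/* spin until our ticket comes up */
}

/* Hand the lock to the next ticket in line. */
static inline void ticket_unlock(struct ticket_lock *l)
{
	atomic_fetch_add_explicit(&l->owner, 1, memory_order_release);
}
```

A later arrival always gets a higher ticket, so the time it waits is
bounded by the number of waiters in front of it.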

> > > + * stealing the lock if come at the right moment, the granting of the
> > > + * lock is mostly in FIFO order.
> > > + * 2. It is faster in high contention situation.
> >
> > Again, why is it faster?
>
> The current rwlock implementation suffers from a thundering herd problem.
> When many readers are waiting for the lock held by a writer, they will all
> jump in more or less at the same time when the writer releases the lock.
> That is not the case with qrwlock. It has been shown in many cases that
> avoiding this thundering herd problem can lead to better performance.

That makes sense and wants to be documented as well. You could have
avoided a lot of the discussion if you had included these details
right away.
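
As a rough model of the herd effect described above, consider a
counter-based reader/writer lock where every waiting reader retries on
the same word. This is a user-space sketch with made-up names and a
made-up bit layout, not the kernel's rwlock implementation. While the
writer holds the lock, every reader attempt fails; the moment the
writer releases it, all of them succeed at once, with no ordering
between them.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical layout: top bit = writer held, low bits = reader count. */
#define SRW_WRITER_BIT 0x80000000u

struct simple_rwlock {
	atomic_uint state;
};

static inline bool srw_read_trylock(struct simple_rwlock *l)
{
	unsigned old = atomic_fetch_add_explicit(&l->state, 1,
						 memory_order_acquire);

	if (old & SRW_WRITER_BIT) {
		/* Writer active: undo our increment and report failure. */
		atomic_fetch_sub_explicit(&l->state, 1, memory_order_relaxed);
		return false;
	}
	return true;
}

static inline void srw_read_unlock(struct simple_rwlock *l)
{
	atomic_fetch_sub_explicit(&l->state, 1, memory_order_release);
}

static inline bool srw_write_trylock(struct simple_rwlock *l)
{
	unsigned expected = 0;	/* only succeeds with no readers, no writer */

	return atomic_compare_exchange_strong_explicit(&l->state, &expected,
			SRW_WRITER_BIT,
			memory_order_acquire, memory_order_relaxed);
}

static inline void srw_write_unlock(struct simple_rwlock *l)
{
	/* Clear only the writer bit; every spinning reader can now get in. */
	atomic_fetch_and_explicit(&l->state, ~SRW_WRITER_BIT,
				  memory_order_release);
}
```

In a real system each spinning reader hammers the same cache line, and
a writer arriving after the release has to wait for the whole herd to
drain. A queued lock avoids that by letting waiters proceed in order.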

> > > + * an increase in lock size is not an issue.
> >
> > So is it faster in the general case or only for the high contention or
> > single thread operation cases?
> >
> > And you still fail to explain WHY it is faster. Can you please explain
> > properly WHY it is faster and WHY we can't apply that technique you
> > implemented for qrwlocks to writer only locks (aka spinlocks) with a
> > smaller lock size?
>
> I will try to collect more data to justify the usefulness of qrwlock.

And please provide a proper argument why we can't use the same
technique for spinlocks.

> > Aside of that, you are replacing all RW locks unconditionally by this
> > new fangled thing, but did you actually run tests which look at other
> > rwlock usage sites than the particular one you care about?
>
> Users have the choice of using the old rwlock or the queue rwlock by
> selecting or unselecting the QUEUE_RWLOCK config parameter. I am not
> forcing the unconditional replacement of rwlock by qrwlock.

Looking at patch 2/2:

+config ARCH_QUEUE_RWLOCK
+       def_bool y

What's conditional about that? Where is the choice?

> > You are optimizing for the high frequency writer case. And that's not
> > the primary use case for rwlocks. That's the special use case for the
> > jbd2 journal_state_lock which CANNOT be generalized for all other
> > rwlock usage sites.
>
> It is true that this lock is kind of optimized for writers. For
> reader heavy code, the performance may not be as good as the rwlock
> for uncontended cases. However, I do believe that the fairness
> attribute of the qrwlock far outweighs the slight performance
> overhead of read lock/unlock. Furthermore, the lock/unlock sequence
> contributes only a very tiny percentage of total CPU time in
> uncontended cases. A slight increase may not really have a material
> impact on performance. Again, as promised, I will try to collect
> some more performance data for reader heavy usage cases.

Yes, please. We really need this information and if it turns out that
it does not affect reader heavy sites, I have no objections against
the technology itself.