Re: [PATCH] locking/rwsem: Make owner store task pointer of last owning reader
From: Waiman Long
Date: Mon Sep 10 2018 - 11:01:13 EST
On 09/10/2018 05:31 AM, Peter Zijlstra wrote:
> On Thu, Sep 06, 2018 at 04:18:34PM -0400, Waiman Long wrote:
>> Currently, when a reader acquires a lock, it only sets the
>> RWSEM_READER_OWNED bit in the owner field. The other bits are simply
>> not used. When debugging hanging cases involving rwsems and readers,
>> the owner value does not provide much useful information at all.
>>
>> This patch modifies the current behavior to always store the task_struct
>> pointer of the last rwsem-acquiring reader in a reader-owned rwsem. This
>> may be useful in debugging rwsem hanging cases especially if only one
>> reader is involved. However, the task in the owner field may not the
>> real owner or one of the real owners at all when the owner value is
>> examined, for example, in a crash dump. So it is just an additional
>> hint about the past history.
>>
>> If CONFIG_DEBUG_RWSEMS is enabled, the owner field will be checked at
>> unlock time too to make sure the task pointer value is valid. That does
>> have a slight performance cost and so is only enabled as part of that
>> debug option.
>>
>> From the performance point of view, it is expected that the changes
>> shouldn't have any noticeable performance impact. A rwsem microbenchmark
>> (with 48 worker threads and 1:1 reader/writer ratio) was ran on a
>> 2-socket 24-core 48-thread Haswell system. The locking rates on a
>> 4.19-rc1 based kernel were as follows:
>>
>> 1) Unpatched kernel: 543.3 kops/s
>> 2) Patched kernel: 549.2 kops/s
>> 3) Patched kernel (CONFIG_DEBUG_RWSEMS on): 546.6 kops/s
>>
>> There was actually a slight increase in performance (1.1%) in this
>> particular case. Maybe it was caused by the elimination of a branch or
>> just a testing noise. Turning on the CONFIG_DEBUG_RWSEMS option also
>> had less than the expected impact on performance.
>>
>> The least significant 2 bits of the owner value are now used to designate
>> the rwsem is readers owned and the owners are anonymous.
> So no immediate objection; but I'm still hoping to some day rewrite the
> whole rwsem thing along the lines we did mutex and merge the 'count' and
> 'owner' fields into one.
>
> [ RT has something along those lines, and I have half a patch that
> starts doing that too, but I never found enough time to really work on
> it :-( ]
>
> Anyway, when we do something like that, this goes out the window of
> course.
That may not be the case. I have also thought about merging owner and
count. For writer, it can write its owner value to the combined
count/owner. For readers, I think we will need to keep a reader count
and so cannot store its owner value there. In that case, we could have a
debug-only field for the reader-owner value.
One major issue with a combined count/owner is that we may have to use
cmpxchg for reader lock which will certainly impact reader-heavy
workloads. I have also thought about ways to compress the task pointer
address so that it can use fewer bits and leave the rests for reader
count. It is probably doable on 64-bit systems, but likely not on 32-bit
system given that there are less bits to play around.
I would certainly like to hear your thought on how to handle these
dilemmas as I may be able to find time to work on that. I had a reader
optimistic spinning patch that was sent out a while ago. I know it won't
get serious time for review until we can resolve this combine
owner/count issue first.
Cheers,
Longman