Re: [PATCH v8 4/5] locking/qspinlock: Introduce starvation avoidance into CNA

From: Alex Kogan
Date: Fri Jan 24 2020 - 16:28:37 EST

Next message: Song Liu: "Re: [PATCH v3] perf/core: fix mlock accounting in perf_mmap()"
Previous message: Song Liu: "[LSF/MM/BPF TOPIC] ls+cat based debugging for BPF"
In reply to: Waiman Long: "Re: [PATCH v8 4/5] locking/qspinlock: Introduce starvation avoidance into CNA"
Next in thread: Waiman Long: "Re: [PATCH v8 4/5] locking/qspinlock: Introduce starvation avoidance into CNA"

> On Jan 24, 2020, at 4:12 PM, Waiman Long <longman@xxxxxxxxxx> wrote:
>
> On 1/24/20 3:09 PM, Alex Kogan wrote:
>>>> We also probably do not want those "prioritized" threads to disrupt
>>>> normal CNA operation. E.g., if the main queue looks like T1_1, P2_1,
>>>> T1_2, ..., where T1_x is a thread running on node 1 and P2_1 is a
>>>> prioritized thread running on node 2, we want to pass the lock from
>>>> T1_1 to P2_1 and then to T1_2 (rather than have P2_1 to scan for
>>>> another thread on node 2).
>>>>
>>>> There is a way to achieve that - when we pass the lock to P2_1,
>>>> we can set its numa node field to 1. This means that we need to
>>>> reset the numa node field in cna_init_node(), but AFAICT this is
>>>> relatively cheap. The rest of the CNA logic should not change.
>>>
>>> I won't recommend doing that. If the lock cacheline has been moved
>>> from node 1 to 2, I will say it is better to stick with node 2 rather
>>> than switching back to node 1. That will mean that the secondary
>>> queue may contain lock waiters from the same nodes, but they will
>>> eventually be flushed back to the primary queue.
>>>
>> That's right, assuming we do not reset intra_node count when
>> transferring the lock to a prioritized thread from another node.
>> Otherwise, we may starve waiters in the secondary queue.
>>
>> Still, that can make the lock even less fair to non-prioritized
>> threads. When you flush the secondary queue, the preference may
>> remain with the same node. This will not happen in the current form
>> of CNA, as we never get threads from the preferred node in the
>> secondary queue.
>
> That is true.
>
> However, it is no different from the current scheme that a waiter from
> another node may have to wait for 64k other waiters to go first before
> it has a chance to get it. Now that waiter can be from the same node as
> well.

The difference is that in the current form of CNA, the preferred node _will_
change after 64k lock transitions. In the change you propose, this is no
longer the case. It may take another ~64k transitions for that to happen.
More generally, I think this makes the analysis of the lock behavior more
convoluted.

I think we should treat those prioritized threads as "wild" cards, passing the
lock through them, but keeping the preferred node intact. This will potentially
cost one extra lock migration, but will make reasoning about the lock
behavior easier.
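
To make the wild-card rule concrete, here is a minimal user-space sketch, not
kernel code. struct waiter, preferred_after_handover() and walk_queue() are
hypothetical names made up for illustration; the sketch models only FIFO
handover in the main queue and ignores the secondary queue and the intra_node
count.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified waiter; not the kernel's struct cna_node. */
struct waiter {
	int  numa_node;    /* node the waiter's CPU belongs to */
	bool prioritized;  /* waiter that is treated as a "wild" card */
};

/*
 * Node that a waiter treats as preferred once it holds the lock.
 * holder_pref is the preferred node of the thread passing the lock.
 * Under the wild-card rule (an assumption of this sketch), a prioritized
 * waiter inherits holder_pref, as if its numa node field were overwritten
 * at handover time; any other waiter prefers its own node.
 */
int preferred_after_handover(int holder_pref, const struct waiter *next)
{
	if (next->prioritized)
		return holder_pref;
	return next->numa_node;
}

/*
 * Pass the lock down a main queue in FIFO order, starting from a holder
 * whose preferred node is initial_pref. The preferred node in effect
 * after the i-th handover is stored in pref_out[i]. Returns the number
 * of handovers that changed the preferred node.
 */
size_t walk_queue(int initial_pref, const struct waiter *q, size_t n,
		  int *pref_out)
{
	size_t changes = 0;
	int pref = initial_pref;

	for (size_t i = 0; i < n; i++) {
		int next_pref = preferred_after_handover(pref, &q[i]);

		if (next_pref != pref)
			changes++;
		pref = next_pref;
		pref_out[i] = pref;
	}
	return changes;
}
```

For the queue T1_1, P2_1, T1_2 with T1_1 holding the lock, walking
{P2_1, T1_2} from preferred node 1 leaves the preferred node at 1 throughout.
If P2_1 is treated as a regular waiter instead, the preference moves to node 2
and then back to node 1.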

Regards,
-- Alex
