Re: [RFC v5 09/11] mm: Try spin lock in speculative path

From: Laurent Dufour
Date: Thu Jul 06 2017 - 11:29:40 EST


On 06/07/2017 16:48, Peter Zijlstra wrote:
> On Thu, Jul 06, 2017 at 03:46:59PM +0200, Laurent Dufour wrote:
>> On 05/07/2017 20:50, Peter Zijlstra wrote:
>>> On Fri, Jun 16, 2017 at 07:52:33PM +0200, Laurent Dufour wrote:
>>>> @@ -2294,8 +2295,19 @@ static bool pte_map_lock(struct vm_fault *vmf)
>>>> if (vma_has_changed(vmf->vma, vmf->sequence))
>>>> goto out;
>>>>
>>>> - pte = pte_offset_map_lock(vmf->vma->vm_mm, vmf->pmd,
>>>> - vmf->address, &ptl);
>
>>>> + ptl = pte_lockptr(vmf->vma->vm_mm, vmf->pmd);
>>>> + pte = pte_offset_map(vmf->pmd, vmf->address);
>>>> + if (unlikely(!spin_trylock(ptl))) {
>>>> + pte_unmap(pte);
>>>> + goto out;
>>>> + }
>>>> +
>>>> if (vma_has_changed(vmf->vma, vmf->sequence)) {
>>>> pte_unmap_unlock(pte, ptl);
>>>> goto out;
>>>
>>> Right, so if you look at my earlier patches you'll see I did something
>>> quite disgusting here.
>>>
>>> Not sure that wants repeating, but I cannot remember why I thought this
>>> deadlock didn't exist anymore.
>>
>> Regarding the deadlock, I did hit it on my Power victim node, so I guess it
>> is still there, and the stack traces are quite explicit.
>> Am I missing something here?
>
> No, you are right in that the deadlock is quite real. What I cannot
> remember is what made me think to remove the really 'wonderful' code I
> had to deal with it.
>
> That said, you might want to look at how often you terminate the
> speculation because of your trylock failing. If that shows up at all we
> might need to do something about it.

Based on the benchmarks I ran, the trylock doesn't fail very often, but I was
thinking about adding some counters here. The system already accounts for
major and minor page faults in current->maj_flt and current->min_flt
respectively. I was wondering whether an additional type like async_flt would
be welcome, or whether there is a smarter way to get that metric.
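
To make the idea concrete, here is a rough user-space sketch of the kind of
accounting I have in mind. It is not kernel code and not part of this series:
the struct spec_stats and spec_try_lock() names are hypothetical, and a
pthread mutex stands in for the page table lock. It only shows counting how
many speculative attempts are abandoned because the trylock failed.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical counters: not part of the patch or of the kernel. */
struct spec_stats {
	atomic_ulong attempts;      /* speculative lock attempts */
	atomic_ulong trylock_fails; /* attempts abandoned because trylock failed */
};

/*
 * User-space stand-in for the trylock step in pte_map_lock(): try to take
 * the lock without blocking and account for a failure.
 * Returns true with the lock held, or false if the caller has to give up
 * the speculative path and fall back to the regular one.
 */
static bool spec_try_lock(pthread_mutex_t *ptl, struct spec_stats *st)
{
	atomic_fetch_add(&st->attempts, 1);
	if (pthread_mutex_trylock(ptl) != 0) {
		atomic_fetch_add(&st->trylock_fails, 1);
		return false;
	}
	return true;
}
```

The ratio trylock_fails / attempts would then tell us whether the aborted
speculation shows up at all. Whether such a counter belongs next to
maj_flt/min_flt or somewhere else is exactly the open question.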

Feel free to advise.

Thanks
Laurent.