Re: [PATCH 7/7] ixgbevf: eliminate duplicate barriers on weakly-ordered archs
From: Alexander Duyck
Date: Thu Mar 15 2018 - 12:58:15 EST
On Thu, Mar 15, 2018 at 9:27 AM, Sinan Kaya <okaya@xxxxxxxxxxxxxx> wrote:
> On 3/15/2018 12:21 PM, Sinan Kaya wrote:
>> On 3/15/2018 10:32 AM, Alexander Duyck wrote:
>>> We tend to do something like:
>>> update tx_buffer_info
>>> update tx_desc
>>> wmb()
>>> point first tx_buffer_info next_to_watch value at last tx_desc
>>> update next_to_use
>>> notify device via writel
>>>
>>> We do it this way because we have to synchronize between the Tx
>>> cleanup path and the hardware so we basically lump the two barriers
>>> together. instead of invoking both a smp_wmb and a wmb. Now that I
>>> look at the pseudocode though I wonder if we shouldn't move the
>>> next_to_use update before the wmb, but that might be material for
>>> another patch. Anyway, in the Tx cleanup path we should have an
>>> smp_rmb() after we read the next_to_watch values so that we avoid
>>> reading any of the other fields in the buffer_info if either the field
>>> is NULL or the descriptor pointed to has not been written back.
>>
>> How do you feel about keeping wmb() very close to writel_relaxed() like this?
>>
>> update tx_buffer_info
>> update tx_desc
>> point first tx_buffer_info next_to_watch value at last tx_desc
>> update next_to_use
>> wmb()
>> notify device via writel_relaxed()
>>
>> I'm afraid that if the order of wmb() and writel() is not very
>> obvious or hidden in multiple functions, somebody can introduce a very nasty
>> bug in the future.
>>
>> We also have to think about code maintenance.
>>
>
> Now that I read your email again, I think this is the reason if I understood you
> correctly.
>
> "instead of invoking both a smp_wmb and a wmb"
>
> You'd need something like
>
> update tx_buffer_info
> update tx_desc
> smp_wmb()
> point first tx_buffer_info next_to_watch value at last tx_desc
> update next_to_use
> wmb()
> notify device via writel_relaxed()
>
> Let me work on your comments.
Yes, we would be doing something like that, but we are using just a
single wmb() to cover both cases since hardware will never look at the
tx_buffer_info and software will never read that descriptor ring as
long as the next_to_watch is NULL. By doing it this way we should have
both cases covered and not need to worry
The only other bit still remaining is the "maybe_stop_tx" logic which
lives between the wmb and writel_relaxed. That logic has a smp_mb
living in it that is triggered if we have to stop the queue. Once
again though that is only viewed by software so it existing between
the wmb and the writel_relaxed should not be an issue.
Starting to understand why I was a bit hesitant to have us start
taking on these changes now? :-)
- Alex