Re: [PATCH] hrtimer: Update softirq_expires_next correctly after __hrtimer_get_next_event()

From: Thomas Gleixner
Date: Thu Apr 22 2021 - 11:35:18 EST


On Thu, Apr 22 2021 at 23:20, Lorenzo Colitti wrote:

> On Thu, Apr 22, 2021 at 9:08 AM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>> Just for comparison. In a VM I'm experimenting with right now the
>> reprogramming time is ~500ns which is still a lot of cycles, but
>> compared to 5us faster by an order of magnitude. And on the sane
>> machine bare metal its way faster and therefore less noticeable.
>
> FWIW, on this hardware, frtrace says that arming the arm64 architected
> timer takes 0.7us. Definitely better than 2-3us, but still not free.
> This is not a high-end desktop or server, but it's also not super
> slow, low-power hardware.
>
>> * The transmit should only be run if no skb data has been sent for a
>> * certain duration.
>>
>> which is useless word salad.
>
> You're the one who wrote that comment - see b1a31a5f5f27. You'll
> forgive me for being amused. :-)

Rightfully so! I still call it word salat :)

> Thanks for the history/analysis/suggestions. I think it's a fact that
> this is a regression in performance: this particular code has
> performed well for a couple of years now. The fact that the good
> performance only existed due to a correctness bug in the hrtimer code
> definitely does make it harder to argue that the regression should be
> reverted.

We tend to disagree about the naming conventions here, but we seem at
least to agree that reverting a fix for a correctness bug (which has way
worse implications than slowing down a gruesome driver) is not going to
happen.

> That said: if you have a fix for the double reprogram, then that fix
> should probably be applied? 0.5us is not free, and even if hrtimers
> aren't designed for frequent updates, touching the hardware twice as
> often does seem like a bad idea, since, as you point out, there's a
> *lot* of hardware that is slow.

That's an obvious improvement, but not a fix. And I checked quite some
hrtimer users and there are only a few which ever rearm an queued timer
and that happens infrequently.

> Separately, we're also going to look at making ncm better. (In defense
> of the original author, in 2014 I don't think anyone would even have
> dreamed of USB being fast enough for this to be a problem.) The first
> thing we're going to try to do is set the timer once per NTB instead
> of once per packet (so, 10x less). My initial attempt to do that
> causes the link to die after a while and I need to figure out why
> before I can send a patch up. I'm suspicious of the threading, which
> uses non-atomic variables (timer_force_tx, ncm->skb_tx_data) to
> synchronize control flow between the timer and the transmit function,
> which can presumably run on different CPUs. That seems wrong since
> either core could observe stale variables. But perhaps there are
> memory barriers that I'm not aware of.

Not that I see any.

> The idea of getting rid of the timer by doing aggregation based on
> transmit queue lengths seems like a much larger effort, but probably
> one that is required to actually improve performance substantially
> beyond what it is now.

I don't think it's a huge effort. netdev_xmit_more() should tell you
what you need to know.

Thanks,

tglx