Re: [PATCH v2] ath10k: transmit queued frames after waking queues

From: Niklas Cassel
Date: Wed May 23 2018 - 17:51:45 EST


On Wed, May 23, 2018 at 06:25:49PM +0200, Erik Stromdahl wrote:
>
>
> On 05/22/2018 11:15 PM, Niklas Cassel wrote:
>
> <snip>
> > >
> > > Earlier we observed performance issues in calling push_pending from each
> > > tx completion. IMHO this change may introduce the same problem again.
> >
> > I prefer functional TX over performance issues,
> > but I agree that it is unfortunate that SDIO doesn't use
> > ath10k_htt_txrx_compl_task().
> > Erik, is there a reason for this?
> The reason is that the SDIO code has been derived mainly from qcacld and ath6kl
> and they don't implement napi.
>
> ath10k_htt_txrx_compl_task is currently only called from the napi poll function,
> and the sdio bus driver doesn't have such a function.

Ok, thanks for the explanation. Perhaps we can change the SDIO code so that it
uses NAPI in the future.

<snip>

> > Another solution might be to change so that we only call
> > ath10k_mac_tx_push_pending() from ath10k_txrx_tx_unref()
> > if (htt->num_pending_tx == 0). That should decrease the number
> > of calls to ath10k_mac_tx_push_pending(), while still avoiding
> > a "TX deadlock" scenario for SDIO.
> Just out of curiosity, where did the limit of 3 come from?
> If it works with a limit of 0, I think it should be used instead.

It came from mt76_txq_schedule():

	if (hwq->swq_queued >= 4 || list_empty(&hwq->swq))
		break;

	len = mt76_txq_schedule_list(dev, hwq);

Since this used a break, I simply inverted the logic,
and called ath10k_mac_tx_push_pending() rather than
mt76_txq_schedule_list().
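
To spell out the inversion: negating the mt76 break condition gives
"fewer than 4 queued AND list not empty", which is where a limit of 3
comes from. Below is a minimal standalone sketch of just that logic.
The helper names should_break() and should_push() and the QUEUED_LIMIT
macro are made up for illustration; they exist in neither mt76 nor ath10k.

```c
#include <stdbool.h>

/* Limit copied from the mt76 snippet quoted above. */
#define QUEUED_LIMIT 4

/* mt76-style test: true means "stop scheduling" (the break). */
static bool should_break(int queued, bool list_is_empty)
{
	return queued >= QUEUED_LIMIT || list_is_empty;
}

/* Inverted test: true means "go ahead and push pending frames". */
static bool should_push(int queued, bool list_is_empty)
{
	/* De Morgan: !(a || b) == (!a && !b) */
	return queued < QUEUED_LIMIT && !list_is_empty;
}
```

So with 3 frames queued and a non-empty list we would push, and with 4
queued (or an empty list) we would not.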

However, I've submitted a V4 now that mimics the behavior
in ath10k_htt_txrx_compl_task() instead, so now I call
ath10k_mac_tx_push_pending() regardless of num_pending_tx.

In most cases, ath10k_mac_tx_push_pending() will not dequeue
any frames, since the ar->txqs list will be empty, so this
shouldn't be so bad after all.
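
As a rough illustration of why the unconditional call should be cheap,
here is a toy model (not the driver code). struct fake_ar,
struct fake_txq and fake_tx_push_pending() are hypothetical stand-ins;
the sketch only models "walk the txqs list and dequeue what is there"
and leaves out everything else the real function does.

```c
#include <stddef.h>

/* Hypothetical stand-in for a per-station/TID software queue. */
struct fake_txq {
	struct fake_txq *next;
	int queued_frames;
};

/* Hypothetical stand-in for the device: just the list of active txqs. */
struct fake_ar {
	struct fake_txq *txqs;	/* NULL means the list is empty */
};

/* Drain every queue on the list; returns the number of frames dequeued. */
static int fake_tx_push_pending(struct fake_ar *ar)
{
	int pushed = 0;

	/* With an empty list the loop body never runs and we return 0. */
	for (struct fake_txq *q = ar->txqs; q != NULL; q = q->next) {
		pushed += q->queued_frames;
		q->queued_frames = 0;
	}
	return pushed;
}
```

In this model, calling fake_tx_push_pending() on every TX completion
costs one pointer check whenever the list is empty, which is the common
case described above.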

>
> Another interesting thing that I stumbled upon when looking into the
> code (while writing this email) is the *wake_up(&htt->empty_tx_wq);*
>
> For some reason I have considered it not to be applicable for HL devices.
>
> The queue is waited for in the flush op (*ath10k_flush*).
> I am unsure what it is used for, but I don't think it affects the TX
> deadlock scenario.

The flush op seems to be called by mac80211 in certain scenarios, but like you said,
it doesn't help with this problem.


Regards,
Niklas