On Fri, Jul 24, 2015 at 2:17 PM, Sudeep Holla <sudeep.holla@xxxxxxx> wrote:
> On 24/07/15 06:02, Jassi Brar wrote:
>> On Wed, Jul 22, 2015 at 5:58 PM, Sudeep Holla <sudeep.holla@xxxxxxx>
>> wrote:
>>> we might end up waiting
>>> for at least a jiffy even though the response for that message from
>>> the remote is received via interrupt and processed in a relatively
>>> smaller time granularity.
>>
>> That is wrong.
>
> No, see below.
>
>> If the controller supports TX interrupt it should set txdone_irq,
>> which prevents polling, i.e. the controller driver calls
>> mbox_chan_txdone. If the controller doesn't support TX interrupt but
>> the client receives some ack packet, then the client should set
>> knows_txdone and call mbox_client_txdone. Again, you don't have to
>> wait on polling.
>
> Sorry if I was not clear in the commit message, but I thought I did
> mention TXDONE_BY_POLL. The case I am referring to is definitely not
> TXDONE_BY_IRQ or TXDONE_BY_ACK.

That statement is still wrong; the TXDONE_BY_POLL modifier doesn't make
it right. Anyway, I see you mean the third case: neither IRQ nor ACK.
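
To be concrete about the TXDONE_BY_ACK path: if the remote did send a
protocol-level ack, the client would do something like the following.
This is a rough, untested sketch with a made-up "demo" client, only to
illustrate the API:

============================
#include <linux/err.h>
#include <linux/mailbox_client.h>

static struct mbox_chan *demo_chan;

/* The remote's ack packet lands here; report TX-done ourselves so the
 * core never starts the poll timer. */
static void demo_rx_callback(struct mbox_client *cl, void *msg)
{
	mbox_client_txdone(demo_chan, 0);
}

static int demo_setup(struct device *dev, struct mbox_client *cl)
{
	cl->dev = dev;
	cl->rx_callback = demo_rx_callback;
	cl->knows_txdone = true;	/* client, not core, signals TX-done */

	demo_chan = mbox_request_channel(cl, 0);
	return PTR_ERR_OR_ZERO(demo_chan);
}
============================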

> A few reasons I have in mind as to why we need that:
>
> 1. The remote processor is capable of dealing with requests in the
>    order of milliseconds. E.g. on Juno it reports that the DVFS
>    transition latency is 1.2ms. Now if we report that to the CPUFreq
>    governor, we need to ensure it. With the current jiffy based timer
>    we see latencies > 10ms.

It seems your remote doesn't send a protocol-level 'ack' packet
replying whether the command was successfully executed or not. That
means Linux can't differentiate successful execution of the command
from a silent failure (the remote still has to set the TX-done flag to
make way for the next message). From Linux's POV our best bet seems to
be to submit the request and simply assume it done. So probably do
async rather than blocking.
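
Something along these lines (again a rough, untested sketch with a
made-up "demo" client) would keep the submission path from ever
sleeping on the poll timer:

============================
#include <linux/device.h>
#include <linux/mailbox_client.h>

/* Called by the core once it considers the transfer done; r < 0 means
 * the transfer was not accepted. */
static void demo_tx_done(struct mbox_client *cl, void *msg, int r)
{
	if (r)
		dev_warn(cl->dev, "request %p failed: %d\n", msg, r);
}

static struct mbox_chan *demo_open_async(struct device *dev,
					 struct mbox_client *cl)
{
	cl->dev = dev;
	cl->tx_done = demo_tx_done;
	cl->tx_block = false;	/* mbox_send_message() returns immediately */

	return mbox_request_channel(cl, 0);
}
============================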

> 2. Because of that, under stress testing with multiple clients active
>    at a time, I am seeing the mailbox buffer overflow quite easily,
>    just because it's blocked on Tx polling (almost 10x slower) and
>    doesn't process new requests even though the remote can handle them.

Yeah, this situation may arise. The following fix is needed regardless,
I think.
============================
diff --git a/drivers/mailbox/mailbox.c b/drivers/mailbox/mailbox.c
index 07fd507..a1dded9 100644
--- a/drivers/mailbox/mailbox.c
+++ b/drivers/mailbox/mailbox.c
@@ -64,7 +64,12 @@ static void msg_submit(struct mbox_chan *chan)
 
 	spin_lock_irqsave(&chan->lock, flags);
 
-	if (!chan->msg_count || chan->active_req)
+	if (chan->active_req) {
+		err = 0;
+		goto exit;
+	}
+
+	if (!chan->msg_count)
 		goto exit;
 
 	count = chan->msg_count;
============================

> 3. Also there are efforts towards scheduler-driven cpu frequency
>    selection, where the desired latency should be as low as possible.
>    Juri (cc-ed) is working on that and reported this issue with the
>    mailbox core.

Sure, latency should be as low as possible. Sounds similar to (1). Was
it reported on some mailing list? Can I see the code somewhere?

> Hope this clarifies the reasons for switching to hrtimer.

I am not against using hrtimer; we just need to make sure we don't
simply suppress the symptoms of a wrong implementation.
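
For the record, the shape of change I would expect is roughly the
following. An untested sketch against mailbox.c: it assumes the
controller's struct timer_list poll is replaced by a hrtimer field
(here called poll_hrt) and reuses the existing internal tx_tick()
helper:

============================
#include <linux/hrtimer.h>
#include <linux/ktime.h>

/* Would replace the jiffy-based poll_txdone(): re-check last_tx_done()
 * on every channel with a request in flight, rearming every
 * txpoll_period ms but with hrtimer resolution. */
static enum hrtimer_restart txdone_hrtimer(struct hrtimer *hrtimer)
{
	struct mbox_controller *mbox =
		container_of(hrtimer, struct mbox_controller, poll_hrt);
	bool resched = false;
	int i;

	for (i = 0; i < mbox->num_chans; i++) {
		struct mbox_chan *chan = &mbox->chans[i];

		if (chan->active_req && chan->cl) {
			resched = true;
			if (chan->mbox->ops->last_tx_done(chan))
				tx_tick(chan, 0);
		}
	}

	if (resched) {
		hrtimer_forward_now(hrtimer,
				    ms_to_ktime(mbox->txpoll_period));
		return HRTIMER_RESTART;
	}
	return HRTIMER_NORESTART;
}
============================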