Re: [PATCH v5 2/8] firmware: add support for ARM System Control and Power Interface(SCPI) protocol

From: Sudeep Holla
Date: Wed Aug 05 2015 - 06:57:44 EST




On 31/07/15 14:45, Jassi Brar wrote:
On Fri, Jul 31, 2015 at 6:38 PM, Sudeep Holla <sudeep.holla@xxxxxxx> wrote:


On 31/07/15 11:43, Sudeep Holla wrote:



On 31/07/15 11:38, Jassi Brar wrote:

On Fri, Jul 31, 2015 at 3:10 PM, Sudeep Holla <sudeep.holla@xxxxxxx>
wrote:


I forgot to mention, we have a the following description in
mbox_client_txdone which is misleading:

"The client/protocol had received some 'ACK' packet and it notifies the
API that the last packet was sent successfully. This only works if the
controller can't sense TX-Done."

which is clearly not the case in SCPI. IMO we may have to reword that.

Yes. And also see whether it could race with polling driven tx_tick.


Yes I am also looking at that now while I am trying to check if
TXDONE_BY_ACK works on Juno, will keep you posted.


OK, I recollect the racy condition now which I had in my mind from the
beginning convincing myself why we can't use that option. I was not good
in words to explain it so far but let me try with the ASCII art this
time. Note Tx ACK below means the remote setting the register flag and
not to be confused with the ACK packet. For simplicity Rx can be assumed
to be Tx ACK packet

Time MHU/SCPI Remote SCP
| |
1 |------------ Tx1 -------------->|
| |
2 |<----------- Tx1 ACK -----------|
| |
3 |------------ Tx2 -------------->|
| |
4 |<----------- Rx1 ---------------|
| |
5 |<----------- Tx2 ACK -----------|
| |
6 |------------ Tx3 -------------->|
| |
7 |<----------- Rx2 ---------------|

Now lets consider the above scenario, suppose we have TXDONE_BY_ACK
and use mbox_client_txdone in Rx interrupt(i.e. response packet), we end
up in the race easily IIUC.

E.g. A client would have sent Tx2(3) before Rx1 interrupt(4) and if we
ACK Tx1 now in (3), we will end up acknowledging Tx2(5) as the mailbox
core assumes only one Tx request at a time which is clearly not the case
in our setup. The client can then go ahead and send Tx3(6) overwriting
the payload while remote was processing or even result in remote missing
the request completely. Does that make sense or am I missing something ?

Yeah, that could happen. But the race can be fixed by checking
last_tx_done if the controller provides that. If last_tx_done is not
implemented, polling won't race.

Please try the following...


Looks good to me. Sometimes it takes very long time(days) to hit race
conditions(esp. in firmware), so I need some time to think before
I cook up a patch to start stress test this on Juno so that I don't
waste time waiting for result.

Regards,
Sudeep
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/