Re: [PATCH 1/3] mailbox: always wait in mbox_send_message for blocking Tx mode

From: Jassi Brar
Date: Wed Mar 29 2017 - 13:46:23 EST


On Wed, Mar 29, 2017 at 5:04 PM, Sudeep Holla <sudeep.holla@xxxxxxx> wrote:
>
> On 28/03/17 19:20, Jassi Brar wrote:
>> Hi Sudeep,
>>
>> On Tue, Mar 21, 2017 at 5:00 PM, Sudeep Holla <sudeep.holla@xxxxxxx> wrote:
>>> There exists a race when msg_submit return immediately as there was an
>>> active request being processed which may have completed just before it's
>>> checked again in mbox_send_message. This will result in return to the
>>> caller without waiting in mbox_send_message even when it's blocking Tx.
>>>
>>> This patch fixes the issue by waiting for the completion always if Tx
>>> is in blocking mode.
>>>
>>> Fixes: 2b6d83e2b8b7 ("mailbox: Introduce framework for mailbox")
>>> Cc: Jassi Brar <jassisinghbrar@xxxxxxxxx>
>>> Reported-by: Alexey Klimov <alexey.klimov@xxxxxxx>
>>> Signed-off-by: Sudeep Holla <sudeep.holla@xxxxxxx>
>>> ---
>>> drivers/mailbox/mailbox.c | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> Hi Jassi,
>>>
>>> Here are fixes for few issues we encountered when dealing with multiple
>>> requests on multiple channels simultaneously.
>>>
>> Thanks for finding the bug.
>>
>> I see patch-1 tries to fix the bug. Patch-2,3 try to fix the
>> ramifications of the bug but they may change behaviour for some users.
>> Do you face any issue even after applying patch-1 ?
>>
>
> Unfortunately yes. Are you concerned with the change in return value on
> timeout ? I understand and then I chose -ETIME vs -ETIMEDOUT as hardware
> can still use it and we can distinguish the software timer expiry from
> that. Even -EIO seems incorrect for s/w timeout as it exists today, but
> I agree it has some impact on existing users.
>
> Also Patch 3 seems independent for me just to avoid complete call if it
> was empty message.
>
> Alexey also brought up another issue which is relating to ordering and
> may require completion flags per message instead of per channel. Today
> we can't guarantee that first blocker on the wait queue is same as the
> first in the mailbox queue.
>
> e.g.:
> Thread#1(T1) Thread#2(T2)
> mbox_send_message mbox_send_message
> | |
> V |
> add_to_rbuf(M1) V
> | add_to_rbuf(M2)
> | |
> | V
> V msg_submit(picks M1)
> msg_submit |
> | V
> V wait_for_completion(on M2)
> wait_for_completion(on M1) | (1st in waitQ)
> | (2nd in waitQ) V
> V wake_up(on completion of M1)<--incorrect
>
Yes, that is a possibility. I have sent a fix for this. It would help
if Alexey/you could give it a try.

Thanks