Re: [PATCH] mailbox: pcc: Fix probabilistic command execution timeout

From: Sudeep Holla

Date: Tue Apr 21 2026 - 06:07:15 EST


On Fri, Apr 17, 2026 at 11:14:29AM +0800, Huisong Li wrote:
> In some scenarios, PCC command may experience probabilistic timeout.
> This is primarily caused by the chan_in_use flag being updated after
> ringing the doorbell, coupled with a lack of proper memory barriers
> across CPU cores.
>
> On fast platforms, a race condition occurs: the platform processing
> completes and triggers an interrupt before the local CPU sets
> chan_in_use to true. When the interrupt handler pcc_mbox_irq() runs,
> it reads chan_in_use as false and incorrectly ignores the interrupt.
>
> This patch fixes the race by:
> 1. Moving the chan_in_use update before ringing the doorbell.
> 2. Using smp_store_release() to ensure the flag update is visible
> to other cores before subsequent hardware or software actions
> are triggered.
> 3. Using smp_load_acquire() in the interrupt handler to ensure the
> latest flag value is read before deciding to skip the interrupt.
>
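
For reference while reviewing, the ordering described in the three steps
above can be sketched in user space as below. This is only an illustration,
not the driver code: C11 atomics stand in for smp_store_release() and
smp_load_acquire(), and struct sketch_chan, sketch_init(), sketch_send() and
sketch_irq() are hypothetical names. Clearing the flag in the handler is an
assumption of the sketch as well.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-channel state; not the driver's struct. */
struct sketch_chan {
	atomic_bool chan_in_use;	/* set by the sender, cleared by the handler */
	int doorbell_rings;		/* counts simulated doorbell writes */
	int handled;			/* counts interrupts the handler accepted */
};

static void sketch_init(struct sketch_chan *c)
{
	assert(c);
	atomic_init(&c->chan_in_use, false);
	c->doorbell_rings = 0;
	c->handled = 0;
}

/* Sender side: publish the flag first, then ring the (simulated) doorbell. */
static void sketch_send(struct sketch_chan *c)
{
	/* User-space analogue of smp_store_release() on chan_in_use. */
	atomic_store_explicit(&c->chan_in_use, true, memory_order_release);
	c->doorbell_rings++;
}

/* Handler side: returns true if the interrupt was handled, false if skipped. */
static bool sketch_irq(struct sketch_chan *c)
{
	/* User-space analogue of smp_load_acquire() on chan_in_use. */
	if (!atomic_load_explicit(&c->chan_in_use, memory_order_acquire))
		return false;	/* no command outstanding, ignore the interrupt */

	c->handled++;
	/* Assumption of this sketch: the handler releases the channel. */
	atomic_store_explicit(&c->chan_in_use, false, memory_order_release);
	return true;
}
```

With the flag stored before the doorbell write, an interrupt that arrives
immediately after the doorbell still observes chan_in_use as true. The
single-threaded sketch shows only the order of operations; it does not
reproduce the cross-CPU race itself.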

Are you seeing the issue on real platforms, or are you just reviewing the
code? I would like to test it on the platform I use, but I don't have it
handy, so it may take some time.

I have added Robbie King, who has also helped test this PCC driver in the
past.

--
Regards,
Sudeep