> > On Wed, Mar 05, 2025 at 11:45:35AM +0800, lihuisong (C) wrote:
> > > On 2025/3/3 18:51, Sudeep Holla wrote:
> > > > The PCC mailbox interrupt handler (pcc_mbox_irq()) currently checks
> > > > for command completion flags and any error status before clearing the
> > > > interrupt.
> > > >
> > > > The below sequence highlights an issue in the handling of PCC mailbox
> > > > interrupts, specifically when dealing with doorbell notifications and
> > > > acknowledgment between the OSPM and the platform where type3 and type4
> > > > channels are sharing the interrupt.
> > > >
> > > > Platform Firmware                       OSPM/Linux PCC driver
> > > > ------------------------------------------------------------------------
> > > >                                         build message in shmem
> > > >                                         ring type3 channel doorbell
> > > > receives the doorbell interrupt
> > > >   process the message from OSPM
> > > >   build response for the message
> > > > ring the platform ack interrupt to OSPM
> > > >                               --->
> > > > build notification in type4 channel
> > > >                                         start processing in pcc_mbox_irq()
> > > >                                          enter pcc handler for type4 chan
> > > >                                             command complete cleared
> > > >                                             read the notification
> > > >                               <---          clear platform ack irq
> > > >         * no effect from above as platform ack irq *
> > > >         * not yet triggered on this channel *
> > > > ring the platform ack irq on type4 channel
> > > >                               --->
> > > >                                          leave pcc handler for type4 chan
> > > >                                          enter pcc handler for type3 chan
> > > >                                             command complete set
> > > >                                             read the response
> > > >                               <---          clear platform ack irq
> > > >                                          leave pcc handler for type3 chan
> > > >                                         leave pcc_mbox_irq() handler
> > > >                                         start processing in pcc_mbox_irq()
> > > >                                          enter pcc handler for type4 chan
> > > >                                          leave pcc handler for type4 chan
> > > >                                          enter pcc handler for type3 chan
> > > >                                          leave pcc handler for type3 chan
> > > >                                         leave pcc_mbox_irq() handler
> > >
> > > This is not easy for me to understand.
> > > The issue as described below is already very clear to me,
> > > so I suggest removing the above flow graph.
> >
> > Yeah, I understood it with a graph similar to the one above, though I
> > simplified it in terms of PCC rather than a specific IP reference.
> >
> > > > The key issue occurs when OSPM tries to acknowledge platform ack
> > > > interrupt for a notification which is ready to be read and processed
> > > > but the interrupt itself is not yet triggered by the platform.
> > > >
> > > > This ineffective acknowledgment leads to an issue later in time where
> > > > the interrupt remains pending as we exit the interrupt handler without
> > > > clearing the platform ack interrupt as there is no pending response or
> > > > notification. The interrupt acknowledgment order is incorrect.
> > >
> > > Has this issue been confirmed? It would be better to have the log. 😁
> > > But it seems a valid issue.
> >
> > Yes, Robbie reported this. He is away and can't test or respond until next
> > week. The log just says there were loads of spurious interrupts, the same
> > "nobody cared" log as you got in the first patch of yours fixing a similar
> > race.
>
> OK, thank you for clarifying.
> >
> > > > To resolve this issue, the platform acknowledgment interrupt should
> > > > always be cleared before processing the interrupt for any notifications
> > > > or response.
> > >
> > > AFAIC, always clearing the platform ack interrupt first is also the
> > > communication flow that the ACPI spec describes.
> >
> > Indeed, not sure how we missed it so far.
> >
> > > I am not sure if it is OK when triggering the interrupt and clearing the
> > > interrupt occur concurrently.
> >
> > Should be OK, as we start clearing all the channels that share the
> > interrupt; if the handler doesn't clear any source, the interrupt must
> > remain asserted.
>
> But this scenario is always possible. I think it doesn't matter for this
> patch. It's just my confusion.

Indeed, it can happen any time as you mentioned. No worries, better to ask
and clarify than assume. Thanks for your time and review.
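
For illustration only, the ordering problem discussed above can be modelled
in a few lines of standalone C. This is a rough sketch under stated
assumptions, not the driver code: struct chan, handle_chan(),
platform_late_ring() and simulate() are hypothetical names, the shared line
is modelled as the OR of two per-channel flags, and the platform is assumed
to ring the type4 ack irq only after OSPM has already read the notification.

```c
#include <assert.h>
#include <stdbool.h>

enum { TYPE3, TYPE4, NCHAN };
#define MAX_RUNS 8	/* give up after this many handler invocations */

/* Hypothetical model of one PCC channel sharing the interrupt line */
struct chan {
	bool irq_asserted;	/* platform ack irq raised on this channel */
	bool data_ready;	/* response/notification waiting in shmem */
};

static struct chan chans[NCHAN];
static bool late_ring_armed;

/* The shared line stays asserted while any channel's irq is asserted */
static bool line_asserted(void)
{
	return chans[TYPE3].irq_asserted || chans[TYPE4].irq_asserted;
}

/* Assumed race: platform rings the type4 irq after the data was read */
static void platform_late_ring(void)
{
	if (late_ring_armed) {
		chans[TYPE4].irq_asserted = true;
		late_ring_armed = false;
	}
}

static void handle_chan(struct chan *c, bool ack_first)
{
	assert(c);

	if (ack_first)
		c->irq_asserted = false;	/* always clear the irq first */

	if (!c->data_ready)
		return;				/* nothing to process */

	if (!ack_first)
		c->irq_asserted = false;	/* old order: clear only with data */

	c->data_ready = false;			/* read response/notification */
	if (c == &chans[TYPE4])
		platform_late_ring();
}

/*
 * Returns the number of handler invocations needed to deassert the line,
 * or -1 if it is still asserted after MAX_RUNS (spurious interrupts).
 */
int simulate(bool ack_first)
{
	int runs = 0;

	/* type3 response is ready and signalled; type4 notification is
	 * built in shmem but its irq has not been rung yet */
	chans[TYPE3] = (struct chan){ .irq_asserted = true, .data_ready = true };
	chans[TYPE4] = (struct chan){ .irq_asserted = false, .data_ready = true };
	late_ring_armed = true;

	while (line_asserted() && runs < MAX_RUNS) {
		handle_chan(&chans[TYPE4], ack_first);
		handle_chan(&chans[TYPE3], ack_first);
		runs++;
	}

	return line_asserted() ? -1 : runs;
}
```

In this model simulate(false) returns -1, because the second invocation
finds no data on type4 and leaves without clearing the late irq, while
simulate(true) returns 2, because the second invocation clears it before
checking for data.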
--
Regards,
Sudeep
.