Re: [PATCH 3/5] ipmi: ssif_bmc: fix message desynchronization after truncated response
From: Jian Zhang
Date: Fri Apr 03 2026 - 04:17:00 EST
> On Apr 3, 2026, at 15:22, Quan Nguyen <quan@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
>
>
> On 2/4/26 18:04, Jian Zhang wrote:
>> A truncated response, caused by host power-off, or other conditions,
>> can lead to message desynchronization.
>> Raw trace data (STOP loss scenario, add state transition comment):
>> 1. T-1: Read response phase (SSIF_RES_SENDING)
>> 8271.955342 WR_RCV [03] <- Read polling cmd
>> 8271.955348 RD_REQ [04] <== SSIF_RES_SENDING <- start sending response
>> 8271.955436 RD_PRO [b4]
>> 8271.955527 RD_PRO [00]
>> 8271.955618 RD_PRO [c1]
>> 8271.955707 RD_PRO [00]
>> 8271.955814 RD_PRO [ad] <== SSIF_RES_SENDING <- last byte
>> <- !! STOP lost (truncated response)
>
> Honestly, I am a little unsure whether this case can actually occur.
> My understanding is that if there is no ACK (SDA is not pulled low by
> the Controller) on the 9th clock pulse while the Target is sending data
> on the bus, the Target will release the SDA line. Eventually there will
> be a STOP condition, and a SLAVE_STOP event should be emitted.
The concern is valid under normal I2C operation, where the controller
keeps driving the clock and issues a NACK, which guarantees a STOP
condition.
However, the scenario described here is different:
- The controller may be powered off or reset abruptly
- SCL may stop toggling before the 9th clock
- As a result, no ACK/NACK phase is completed
- Therefore, no STOP condition is generated on the bus
In such cases, the target state machine may remain in
SSIF_RES_SENDING without receiving a SLAVE_STOP event,
leading to desynchronization.
Additionally, in my previous experience with MCTP over I2C, various
abnormal conditions (such as power loss, firmware bugs, or I2C bus
hangs) can interrupt the signaling during any of the
I2C_SLAVE_READ_REQUESTED, I2C_SLAVE_WRITE_REQUESTED,
I2C_SLAVE_READ_PROCESSED, and I2C_SLAVE_WRITE_RECEIVED events.
Therefore, we cannot assume that a STOP condition will always be
observed.
The current change addresses this by allowing the state machine, when in SSIF_ABORTING,
to handle a newly detected command request and proceed accordingly.
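
To make the intended behavior concrete, here is a rough user-space
sketch of the idea. It is not the driver code or the patch itself: the
SIM_* names, sim_step() and sim_replay_lost_stop() are hypothetical,
the states only loosely mirror the SSIF_* states, and the assumption
that an unexpected event while sending moves the machine to the
aborting state is a simplification made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical states, loosely named after the SSIF_* states in this thread. */
enum sim_state { SIM_READY, SIM_REQ_RECVING, SIM_RES_SENDING, SIM_ABORTING };

/* Hypothetical events, one per I2C target (slave) event type. */
enum sim_event {
	SIM_WRITE_REQUESTED,
	SIM_WRITE_RECEIVED,
	SIM_READ_REQUESTED,
	SIM_READ_PROCESSED,
	SIM_STOP,
};

/*
 * One transition of the simplified state machine.
 * recover == false: once aborting, only STOP brings the machine back.
 * recover == true:  a new write request is also accepted while aborting.
 */
enum sim_state sim_step(enum sim_state st, enum sim_event ev, bool recover)
{
	if (ev == SIM_STOP)
		return SIM_READY;

	if (st == SIM_READY)
		return ev == SIM_WRITE_REQUESTED ? SIM_REQ_RECVING : SIM_READY;

	if (st == SIM_REQ_RECVING)
		return ev == SIM_READ_REQUESTED ? SIM_RES_SENDING : SIM_REQ_RECVING;

	if (st == SIM_RES_SENDING && ev == SIM_READ_PROCESSED)
		return SIM_RES_SENDING;

	/* Unexpected event while sending, or already aborting. */
	if (recover && ev == SIM_WRITE_REQUESTED)
		return SIM_REQ_RECVING;
	return SIM_ABORTING;
}

/* Replay the truncated response from the trace, then a fresh request. */
enum sim_state sim_replay_lost_stop(bool recover)
{
	static const enum sim_event truncated[] = {
		SIM_WRITE_REQUESTED, SIM_WRITE_RECEIVED,	/* WR_RCV [03] */
		SIM_READ_REQUESTED,				/* RD_REQ [04] */
		SIM_READ_PROCESSED, SIM_READ_PROCESSED, SIM_READ_PROCESSED,
		SIM_READ_PROCESSED, SIM_READ_PROCESSED,	/* last byte */
		/* no SIM_STOP: the STOP condition is lost here */
	};
	static const enum sim_event next_request[] = {
		SIM_WRITE_REQUESTED, SIM_WRITE_RECEIVED, SIM_READ_REQUESTED,
	};
	enum sim_state st = SIM_READY;
	size_t i;

	for (i = 0; i < sizeof(truncated) / sizeof(truncated[0]); i++)
		st = sim_step(st, truncated[i], recover);

	/* Without a STOP the machine is still in the sending state. */
	assert(st == SIM_RES_SENDING);

	for (i = 0; i < sizeof(next_request) / sizeof(next_request[0]); i++)
		st = sim_step(st, next_request[i], recover);

	return st;
}
```

In this model, sim_replay_lost_stop(false) ends in SIM_ABORTING, so the
next request is dropped, while sim_replay_lost_stop(true) ends in
SIM_RES_SENDING, i.e. the new request is picked up and answered.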
- Jian
>
> - Quan