Re: [PATCH v3 3/3] i3c: master: Validate GET CCC payload length and retry M0/M2 once

From: Adrian Hunter

Date: Wed Jun 24 2026 - 08:23:02 EST


On 24/06/2026 11:28, NG, TZE YEE wrote:
> On 23/6/2026 8:29 pm, Adrian Hunter wrote:
>> On 19/06/2026 11:09, NG, TZE YEE wrote:
>>> On 16/6/2026 3:41 pm, Adrian Hunter wrote:
>>>> [You don't often get email from adrian.hunter@xxxxxxxxx. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ]
>>>>
>>>> On 11/06/2026 04:54, tze.yee.ng@xxxxxxxxxx wrote:
>>>>> From: Adrian Ng Ho Yin <adrian.ho.yin.ng@xxxxxxxxxx>
>>>>>
>>>>> Validate GET CCC payload length after a successful transfer. Treat a
>>>>> short read as I3C_ERROR_M0 and return -EIO. GETMRL accepts exactly 2 or
>>>>> 3 bytes per the I3C spec defined formats. GETMXDS may return 2 bytes
>>>>> (format 1) or 5 bytes (format 2) per I3C spec.
>>>>>
>>>>> Retry GET CCCs once on retriable errors: I3C_ERROR_M0 (frame error) and
>>>>> I3C_ERROR_M2 (address-header NACK, e.g. IBI or Controller Role Request
>>>> Some controller drivers do not set I3C_ERROR_M2 correctly:
>>>> svc-i3c-master seems to set I3C_ERROR_M2 on all errors
>>>> mipi-i3c-hci sets it also on error status 0x5: NACK: Address was NACK’ed
>>>>
>>>> Others need to be checked
>>>>
>>> I checked the same paths:
>>> - svc-i3c-master sets I3C_ERROR_M2 on any CCC failure.
>>> - mipi-i3c-hci maps RESP_ERR_NACK (0x5) to I3C_ERROR_M2.
>>> So a generic core retry on I3C_ERROR_M2 is only as good as each driver's
>>> error reporting. Our DW series maps specific hardware status bits to
>>> M0/M2, but we agree the wider driver behaviour should be reviewed before
>>> relying on M2 retry across all masters.
>>>
>>> We can either narrow this series to DW-only behaviour for now, or follow
>>> up with a separate audit/fix of M2 reporting in other drivers. Happy to
>>> take your preference.
>>
>> I am wondering if we should ignore the nature of the error and just
>> retry after any error.
>>
>> Also it seems like the number of retries should be passed by the caller,
>> so that different commands could be treated differently. There is now
>> i3c_master_i3c_dev_present() which does its own retries.
>>
> On retry: agreed — we will drop M0/M2-based retry and retry on any error
> where a caller decides it is appropriate.
> For retry count, we would like ask your preference between two approaches:
> (i) Sysfs ccc_retry_count on the I3C master (similar to the existing
> dev_nack_retry_count): default 1, user-tunable via sysfs. Callers that
> need retries would read this from the master controller (or a shared
> helper would).

Let's not add sysfs knobs to start with. If a platform or device needs
special treatment, it is better to configure that via device properties
or DMI quirks or some such.

> (ii) Caller-local constants (like your i3c_master_i3c_dev_present()
> pattern : each helper that needs retries defines its own MAX_CNT /
> backoff and loops around i3c_master_send_ccc_cmd_locked() — no new sysfs
> knob.
> Which do you prefer?

Could define a default:

#define I3C_CCC_RETRIES 1

Perhaps add i3c_ccc_cmd_init_retries() like i3c_ccc_cmd_init() but with
a retries parameter, and then i3c_ccc_cmd_init() just calls
i3c_ccc_cmd_init_retries(..., I3C_CCC_RETRIES)

Call sites that need a different number of retries can call
i3c_ccc_cmd_init_retries() instead of i3c_ccc_cmd_init()

While we are only doing 1 retry anyway, maybe let's not bother with a
backoff.

>>>>> arbitration per I3C spec section 5.1.2.2.3). SET CCCs are not retried
>>>>
>>>> What has section 5.1.2.2.3 got to do with I3C_ERROR_M2?
>>>>
>>> Section 5.1.2.2.3 describes the recovery behaviour for a transient
>>> address-header NACK (e.g. IBI/CRR arbitration), not the definition of
>>> I3C_ERROR_M2 itself.
>>>
>>> M2 is the address-header NACK error code; §5.1.2.2.3 is one case where
>>> the spec says software should re-issue the transfer. We will reword the
>>> commit message to make that distinction clear and drop the implication
>>> that M2 is defined in that section.
>>
>> The spec. has a different definition of M2:
>>
>> "If the Master does not receive an ACK of a transmitted Broadcast Address (7’h7E),"
>>
>> Whereas section 5.1.2.2.3 is about Frames starting with the target address.
>> That could be covered by DEV_NACK_RETRY_CNT (which we should change to
>> default to 1 not 0 by the way)
>>
> Agreed. M2 in spec is the broadcast-address (7'h7E) NACK case, not
> address-header NACK error code. I will reword the commit message in v4.
>
> For the DEV_NACK_RETRY_CNT default value, we will update it in separate
> patch.>>>> to avoid repeating side-effecting commands. Restore
> dests[].payload.len
>>>>> to the originally requested length before each attempt and again before
>>>>> returning an error, so callers that adjust the length on failure (e.g.
>>>>> i3c_master_getmxds_locked()) do not underflow a shortened value.
>>>>>
>>>>> Use a stack buffer for the common single-destination GET case and only
>>>>> kmalloc when ndests > 1.
>>>>>
>>>>> Signed-off-by: Adrian Ng Ho Yin <adrian.ho.yin.ng@xxxxxxxxxx>
>>>>> Signed-off-by: Tze Yee Ng <tze.yee.ng@xxxxxxxxxx>
>>>>> ---
>>>>> Changes in v3:
>>>>> - Drop the change that moves RESPONSE_ERROR_ADDRESS_NACK to default case
>>>>> in dw_i3c_master_end_xfer_locked(). Now dw_i3c_master_end_xfer_locked()
>>>>> returns -EIO for RESPONSE_ERROR_ADDRESS_NACK.
>>>>> ---
>>>>> drivers/i3c/master.c | 111 ++++++++++++++++++++++++++++++++++++++++++-
>>>>> 1 file changed, 110 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/i3c/master.c b/drivers/i3c/master.c
>>>>> index 5cd4e5da2233..c94d37cd8b3f 100644
>>>>> --- a/drivers/i3c/master.c
>>>>> +++ b/drivers/i3c/master.c
>>>>> @@ -26,6 +26,12 @@ static DEFINE_MUTEX(i3c_core_lock);
>>>>> static int __i3c_first_dynamic_bus_num;
>>>>> static BLOCKING_NOTIFIER_HEAD(i3c_bus_notifier);
>>>>>
>>>>> +#define I3C_CCC_GETMRL_LEN_SHORT 2
>>>>> +#define I3C_CCC_GETMRL_LEN_FULL 3
>>>>> +#define I3C_CCC_GETMXDS_LEN_SHORT 2
>>>>> +#define I3C_CCC_GETMXDS_LEN_FULL 5
>>>>> +#define I3C_CCC_MAX_RETRIES 2
>>>>> +
>>>>> /**
>>>>> * i3c_bus_maintenance_lock - Lock the bus for a maintenance operation
>>>>> * @bus: I3C bus to take the lock on
>>>>> @@ -925,6 +931,61 @@ static void i3c_ccc_cmd_init(struct i3c_ccc_cmd *cmd, bool rnw, u8 id,
>>>>> cmd->err = I3C_ERROR_UNKNOWN;
>>>>> }
>>>>>
>>>>> +static bool i3c_ccc_get_payload_ok(u8 id, u16 req_len, u16 actual_len)
>>>>> +{
>>>>> + if (actual_len > req_len)
>>>>> + return false;
>>>>> +
>>>>> + if (!req_len)
>>>>> + return actual_len == 0;
>>>>> +
>>>>> + if (id == I3C_CCC_GETMRL)
>>>>> + return actual_len == I3C_CCC_GETMRL_LEN_SHORT ||
>>>>> + actual_len == I3C_CCC_GETMRL_LEN_FULL;
>>>>> +
>>>>> + if (id == I3C_CCC_GETMXDS)
>>>>> + return actual_len == I3C_CCC_GETMXDS_LEN_SHORT ||
>>>>> + actual_len == I3C_CCC_GETMXDS_LEN_FULL;
>>>>
>>>> It would be better to contain individual CCC information in
>>>> the caller of i3c_master_send_ccc_cmd_locked(). Perhaps
>>>> add optional_bytes to struct i3c_ccc_cmd_payload:
>>>> For I3C_CCC_GETMRL, optional_bytes = 1
>>>> For I3C_CCC_GETMXDS, optional_bytes = 3
>>>>
>>> Agreed. The GETMRL/GETMXDS length rules are caller-specific and fit
>>> better at the call site than as hardcoded CCC IDs in
>>> i3c_master_send_ccc_cmd_locked().
>>>
>>> Setting optional_bytes in the caller (e.g. getmrl_locked() → 1,
>>> getmxds_locked() → 3). We can respin v4 with that approach.>> +
>>>>> + return actual_len == req_len;
>>>>> +}
>>>>> +
>>>>> +static int i3c_ccc_validate_payload_len(struct i3c_ccc_cmd *cmd,
>>>>> + const u16 *req_lens)
>>>>> +{
>>>>> + unsigned int i;
>>>>> +
>>>>> + if (!cmd->rnw)
>>>>> + return 0;
>>>>> +
>>>>> + for (i = 0; i < cmd->ndests; i++) {
>>>>> + u16 actual = cmd->dests[i].payload.len;
>>>>> + u16 req = req_lens[i];
>>>>> +
>>>>> + if (!i3c_ccc_get_payload_ok(cmd->id, req, actual)) {
>>>>> + cmd->err = I3C_ERROR_M0;
>>>>> + return -EIO;
>>>>> + }
>>>>> + }
>>>>> +
>>>>> + return 0;
>>>>> +}
>>>>> +
>>>>> +/*
>>>>> + * M0: transient frame errors.
>>>>> + * M2: address-header NACK (I3C spec section 5.1.2.2.3), e.g. when a target
>>>>> + * simultaneously asserts an IBI or Controller Role Request and neither
>>>>> + * side ACKs. Software should re-issue the transfer; the controller wins
>>>>> + * arbitration after Repeated START.
>>>>> + *
>>>>> + * Retries apply to GET CCCs only; SET CCCs are not retried to avoid
>>>>> + * repeating side-effecting commands.
>>>>> + */
>>>>> +static bool i3c_ccc_err_retriable(enum i3c_error_code err)
>>>>> +{
>>>>> + return err == I3C_ERROR_M0 || err == I3C_ERROR_M2;
>>>>> +}
>>>>> +
>>>>> /**
>>>>> * i3c_master_send_ccc_cmd_locked() - send a CCC (Common Command Codes)
>>>>> * @master: master used to send frames on the bus
>>>>> @@ -936,9 +997,17 @@ static void i3c_ccc_cmd_init(struct i3c_ccc_cmd *cmd, bool rnw, u8 id,
>>>>> static int i3c_master_send_ccc_cmd_locked(struct i3c_master_controller *master,
>>>>> struct i3c_ccc_cmd *cmd)
>>>>> {
>>>>> + u16 req_len;
>>>>> + u16 *req_lens = NULL;
>>>>> + u16 *req_lens_alloc = NULL;
>>>>> + unsigned int i;
>>>>> + int ret, retries;
>>>>> +
>>>>> if (!cmd || !master)
>>>>> return -EINVAL;
>>>>>
>>>>> + retries = cmd->rnw ? I3C_CCC_MAX_RETRIES : 1;
>>>>> +
>>>>> if (WARN_ON(master->init_done &&
>>>>> !rwsem_is_locked(&master->bus.lock)))
>>>>> return -EINVAL;
>>>>> @@ -953,7 +1022,47 @@ static int i3c_master_send_ccc_cmd_locked(struct i3c_master_controller *master,
>>>>> !master->ops->supports_ccc_cmd(master, cmd))
>>>>> return -EOPNOTSUPP;
>>>>>
>>>>> - return master->ops->send_ccc_cmd(master, cmd);
>>>>> + if (cmd->rnw && cmd->dests && cmd->ndests) {
>>>>> + if (cmd->ndests == 1) {
>>>>> + req_len = cmd->dests[0].payload.len;
>>>>> + req_lens = &req_len;
>>>>> + } else {
>>>>> + req_lens_alloc = kmalloc_array(cmd->ndests,
>>>>> + sizeof(*req_lens_alloc),
>>>>> + GFP_KERNEL);
>>>>
>>>> Simpler to add actual_len to struct i3c_ccc_cmd_payload and
>>>> amend controller drivers to use that.
>>>>
>>> Agreed. We will add actual_len to struct i3c_ccc_cmd_payload in v4.>> +
>>> if (!req_lens_alloc)
>>>>> + return -ENOMEM;
>>>>> +
>>>>> + req_lens = req_lens_alloc;
>>>>> + for (i = 0; i < cmd->ndests; i++)
>>>>> + req_lens[i] = cmd->dests[i].payload.len;
>>>>> + }
>>>>> + }
>>>>> +
>>>>> + do {
>>>>> + cmd->err = I3C_ERROR_UNKNOWN;
>>>>> + if (req_lens) {
>>>>> + for (i = 0; i < cmd->ndests; i++)
>>>>> + cmd->dests[i].payload.len = req_lens[i];
>>>>> + }
>>>>> + ret = master->ops->send_ccc_cmd(master, cmd);
>>>>> + if (!ret && req_lens)
>>>>> + ret = i3c_ccc_validate_payload_len(cmd, req_lens);
>>>>> + } while (--retries && ret && i3c_ccc_err_retriable(cmd->err));
>>>>> +
>>>>> + if (ret && req_lens) {
>>>>> + /*
>>>>> + * Drivers may update payload.len to the actual RX count;
>>>>> + * restore the requested length so callers can safely adjust
>>>>> + * it on error (e.g. i3c_master_getmxds_locked()).
>>>>> + */
>>>>> + for (i = 0; i < cmd->ndests; i++)
>>>>> + cmd->dests[i].payload.len = req_lens[i];
>>>>> + }
>>>>> +
>>>>> + kfree(req_lens_alloc);
>>>>> +
>>>>> + return ret;
>>>>> }
>>>>>
>>>>> static struct i2c_dev_desc *
>>>>
>>>
>>
>