Re: [PATCH 1/2] i2c: smbus: Improve handling of stuck alerts

From: Wolfram Sang
Date: Sun Jul 28 2024 - 16:01:37 EST


On Mon, Jan 10, 2022 at 09:28:56AM -0800, Guenter Roeck wrote:
> The following messages were observed while testing alert functionality
> on systems with multiple I2C devices on a single bus if alert was active
> on more than one chip.
>
> smbus_alert 3-000c: SMBALERT# from dev 0x0c, flag 0
> smbus_alert 3-000c: no driver alert()!
>
> and:
>
> smbus_alert 3-000c: SMBALERT# from dev 0x28, flag 0
>
> Once it starts, this message repeats forever at high rate. There is no
> device at any of the reported addresses.
>
> Analysis shows that this is seen if multiple devices have the alert pin
> active. Apparently some devices do not support SMBus arbitration correctly.
> They keep sending address bits after detecting an address collision and
> handle the collision not at all or too late.
> Specifically, address 0x0c is seen with ADT7461A at address 0x4c and
> ADM1021 at address 0x18 if alert is active on both chips. Address 0x28 is
> seen with ADT7483 at address 0x2a and ADT7461 at address 0x4c if alert is
> active on both chips.
>
> Once the system is in bad state (alert is set by more than one chip),
> it often only recovers by power cycling.
>
> To reduce the impact of this problem, abort the endless loop in
> smbus_alert() if the same address is read more than once and not
> handled by a driver.
>
> Fixes: b5527a7766f0 ("i2c: Add SMBus alert support")
> Signed-off-by: Guenter Roeck <linux@xxxxxxxxxxxx>

It fixed the interrupt storm for me and I like the approach. I'd apply
it to for-current once rc1 is released. Two minor changes, though.

> ---
> drivers/i2c/i2c-smbus.c | 30 ++++++++++++++++++++++++------
> 1 file changed, 24 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/i2c/i2c-smbus.c b/drivers/i2c/i2c-smbus.c
> index d3d06e3b4f3b..533c885b99ac 100644
> --- a/drivers/i2c/i2c-smbus.c
> +++ b/drivers/i2c/i2c-smbus.c
> @@ -34,6 +34,7 @@ static int smbus_do_alert(struct device *dev, void *addrp)
> struct i2c_client *client = i2c_verify_client(dev);
> struct alert_data *data = addrp;
> struct i2c_driver *driver;
> + int ret;
>
> if (!client || client->addr != data->addr)
> return 0;
> @@ -47,16 +48,21 @@ static int smbus_do_alert(struct device *dev, void *addrp)
> device_lock(dev);
> if (client->dev.driver) {
> driver = to_i2c_driver(client->dev.driver);
> - if (driver->alert)
> + if (driver->alert) {
> driver->alert(client, data->type, data->data);
> - else
> + ret = -EBUSY;
> + } else {
> dev_warn(&client->dev, "no driver alert()!\n");
> - } else
> + ret = -EOPNOTSUPP;
> + }
> + } else {
> dev_dbg(&client->dev, "alert with no driver\n");
> + ret = -ENODEV;
> + }
> device_unlock(dev);
>
> /* Stop iterating after we find the device */

I moved this comment up where -EBUSY is assigned to 'ret'.

> - return -EBUSY;
> + return ret;
> }
>
> /*
> @@ -67,6 +73,7 @@ static irqreturn_t smbus_alert(int irq, void *d)
> {
> struct i2c_smbus_alert *alert = d;
> struct i2c_client *ara;
> + unsigned short prev_addr = 0; /* Not a valid address */

I used I2C_CLIENT_END as an invalid address. We use it for the same
purpose in other parts of the core as well.

>
> ara = alert->ara;
>
> @@ -94,8 +101,19 @@ static irqreturn_t smbus_alert(int irq, void *d)
> data.addr, data.data);
>
> /* Notify driver for the device which issued the alert */
> - device_for_each_child(&ara->adapter->dev, &data,
> - smbus_do_alert);
> + status = device_for_each_child(&ara->adapter->dev, &data,
> + smbus_do_alert);
> + /*
> + * If we read the same address more than once, and the alert
> + * was not handled by a driver, it won't do any good to repeat
> + * the loop because it will never terminate.
> + * Bail out in this case.
> + * Note: This assumes that a driver with alert handler handles
> + * the alert properly and clears it if necessary.
> + */
> + if (data.addr == prev_addr && status != -EBUSY)
> + break;
> + prev_addr = data.addr;
> }
>
> return IRQ_HANDLED;
> --
> 2.33.0
>

Attachment: signature.asc
Description: PGP signature