Re: [PATCH] power: supply: bq27xxx_battery: Do not return ENODEV when busy

From: Pali Rohár
Date: Mon Sep 23 2024 - 14:16:49 EST


Thank you for detailed information about i2c NAK. In this case try to
consider if it would not be better to add retry logic in the
bq27xxx_battery_i2c_read() function.

If it is common that bq chipset itself returns i2c NAKs during normal
operations then this affects any i2c read operation done by
bq27xxx_battery_i2c_read() function.

So this issue is not related just to reading "flags", but to anything.
That is why I think that retry should be handled at lower layer.

On Monday 23 September 2024 08:14:13 Jerry Lv wrote:
> Hi Pali,
>
> Thanks for your excellent suggestion, I will change the code accordingly.
>
> About the question:
> Anyway, which bus is BQ27Z561-R2 using (i2c?)? And how is EBUSY indicated or transferred over wire?
> --- Yes, we connect the gauge BQ27Z561 to I2C. When it's busy, the feedback we got from the logic analyser is "NAK".
>
>
> Best Regards,
> Jerry Lv
>
> ________________________________________
> From: Pali Rohár <pali@xxxxxxxxxx>
> Sent: Saturday, September 14, 2024 4:24 PM
> To: Jerry Lv
> Cc: Sebastian Reichel; linux-pm@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Kernel
> Subject: Re: [PATCH] power: supply: bq27xxx_battery: Do not return ENODEV when busy
>
> Hello Jerry,
>
> I think that this issue should be handled in different way.
>
> First thing is to propagate error and not change it to -ENODEV. This is
> really confusing and makes debugging harder.
>
> Second thing, if bq27xxx_read() returns -EBUSY, sleep few milliseconds
> and call bq27xxx_read() again.
>
> This should cover the issue which you are observing and also fixing the
> problem which you introduced in your change (interpreting error code as
> bogus cache data).
>
> Anyway, which bus is BQ27Z561-R2 using (i2c?)? And how is EBUSY
> indicated or transferred over wire?
>
> Pali
>
> On Saturday 14 September 2024 02:57:39 Jerry Lv wrote:
> > Hi Pali,
> >
> > (Sorry for inconvineient! previous email was rejected by some email list for some HTML part, so I edit it and send it again.)
> >
> > Yes, bq27xxx_read() will return -EBUSY, and bq27xxx_read() will be called in many functions.
> >
> > In our product, some different applications may access the gauge BQ27Z561-R2, and we see many times the returned error code is -ENODEV.
> > After debugging it by oscillograph and adding some debug info, we found the device is busy sometimes, and it will recover very soon(a few milliseconds).
> > So, we want to exclude the busy case before return -ENODEV.
> >
> > Best Regards,
> > Jerry
> >
> > On Friday 13 September 2024 16:45:37 Jerry Lv wrote:
> > > Multiple applications may access the device gauge at the same time, so the
> > > gauge may be busy and EBUSY will be returned. The driver will set a flag to
> > > record the EBUSY state, and this flag will be kept until the next periodic
> > > update. When this flag is set, bq27xxx_battery_get_property() will just
> > > return ENODEV until the flag is updated.
> >
> > I did not find any evidence of EBUSY. Which function and to which caller
> > it returns? Do you mean that bq27xxx_read() returns -EBUSY?
> >
> > > Even if the gauge was busy during the last accessing attempt, returning
> > > ENODEV is not ideal, and can cause confusion in the applications layer.
> >
> > It would be better to either propagate correct error or return old value
> > from cache...
> >
> > > Instead, retry accessing the gauge to update the properties is as expected.
> > > The gauge typically recovers from busy state within a few milliseconds, and
> > > the cached flag will not cause issues while updating the properties.
> > >
> > > Signed-off-by: Jerry Lv <Jerry.Lv@xxxxxxxx>
> > > ---
> > > drivers/power/supply/bq27xxx_battery.c | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/power/supply/bq27xxx_battery.c b/drivers/power/supply/bq27xxx_battery.c
> > > index 750fda543308..eefbb5029a3b 100644
> > > --- a/drivers/power/supply/bq27xxx_battery.c
> > > +++ b/drivers/power/supply/bq27xxx_battery.c
> > > @@ -2029,7 +2029,7 @@ static int bq27xxx_battery_get_property(struct power_supply *psy,
> > > bq27xxx_battery_update_unlocked(di);
> > > mutex_unlock(&di->lock);
> > >
> > > - if (psp != POWER_SUPPLY_PROP_PRESENT && di->cache.flags < 0)
> > > + if (psp != POWER_SUPPLY_PROP_PRESENT && di->cache.flags < 0 && di->cache.flags != -EBUSY)
> > > return -ENODEV;
> >
> > ... but ignoring error and re-using the error return value as flags in
> > code later in this function is bad idea.
> >
> > >
> > > switch (psp) {
> > >
> > > ---
> > > base-commit: da3ea35007d0af457a0afc87e84fddaebc4e0b63
> > > change-id: 20240913-foo-fix2-a0d79db86a0b
> > >
> > > Best regards,
> > > --
> > > Jerry Lv <Jerry.Lv@xxxxxxxx>
> > >
> >