Re: [PATCH] power: supply: bq27xxx_battery: Do not return ENODEV when busy

From: Pali Rohár
Date: Sat Sep 14 2024 - 04:25:04 EST


Hello Jerry,

I think that this issue should be handled in a different way.

First thing is to propagate the error and not change it to -ENODEV.
Changing it is really confusing and makes debugging harder.
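
A rough, untested sketch of what I mean by propagating, written as plain
userspace C so it builds on its own. struct fake_cache and
cache_status() are made-up names for illustration only, not the
driver's real code:

```c
#include <errno.h>

/* Hypothetical stand-in for the driver's cached state (not the real struct). */
struct fake_cache {
	int flags;	/* register value, or a negative errno if the read failed */
};

/*
 * Return 0 when the cached flags are usable, otherwise hand the original
 * error code back to the caller instead of collapsing every negative
 * value into -ENODEV.
 */
static int cache_status(const struct fake_cache *cache)
{
	if (cache->flags < 0)
		return cache->flags;	/* e.g. -EBUSY stays -EBUSY */
	return 0;
}
```

With this the caller sees -EBUSY as -EBUSY and can decide on its own
whether to try again.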

Second thing, if bq27xxx_read() returns -EBUSY, sleep a few milliseconds
and call bq27xxx_read() again.

This should cover the issue which you are observing and also fix the
problem which you introduced in your change (interpreting an error code
as bogus cache data).
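
To illustrate the retry idea, here is a minimal, untested sketch in
plain userspace C. Everything in it is an assumption made for the
example: the retry limit, the delay, the gauge_ops hooks and the fake
gauge are hypothetical. In the driver the read hook would be
bq27xxx_read() and the delay something like msleep() or usleep_range().

```c
#include <errno.h>

#define BUSY_RETRIES	3	/* assumed limit, not taken from any datasheet */
#define BUSY_DELAY_MS	5	/* assumed value for "a few milliseconds" */

/* Hypothetical hooks so the sketch builds outside the kernel. */
struct gauge_ops {
	int (*read)(void *ctx, unsigned char reg);	/* stands in for bq27xxx_read() */
	void (*delay_ms)(void *ctx, unsigned int ms);	/* a sleep call in a driver */
};

/* Read a register, retrying only while the gauge reports -EBUSY. */
static int gauge_read_retry(const struct gauge_ops *ops, void *ctx,
			    unsigned char reg)
{
	int tries, ret;

	for (tries = 0; ; tries++) {
		ret = ops->read(ctx, reg);
		if (ret != -EBUSY || tries >= BUSY_RETRIES)
			return ret;	/* value, other error, or still busy */
		ops->delay_ms(ctx, BUSY_DELAY_MS);
	}
}

/* Fake gauge for trying the helper: busy for the first busy_left reads. */
struct fake_gauge {
	int busy_left;
	int value;
	unsigned int slept_ms;
};

static int fake_read(void *ctx, unsigned char reg)
{
	struct fake_gauge *g = ctx;

	(void)reg;
	if (g->busy_left > 0) {
		g->busy_left--;
		return -EBUSY;
	}
	return g->value;
}

static void fake_delay(void *ctx, unsigned int ms)
{
	struct fake_gauge *g = ctx;

	g->slept_ms += ms;	/* only record the delay, do not really sleep */
}

static const struct gauge_ops fake_ops = { fake_read, fake_delay };
```

Any error other than -EBUSY is returned immediately, and a gauge that
stays busy after the last retry still reports -EBUSY, so nothing gets
turned into -ENODEV along the way.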

Anyway, which bus is BQ27Z561-R2 using (i2c?)? And how is EBUSY
indicated or transferred over the wire?

Pali

On Saturday 14 September 2024 02:57:39 Jerry Lv wrote:
> Hi Pali,
>
> (Sorry for the inconvenience! The previous email was rejected by some mailing lists because of an HTML part, so I edited it and am sending it again.)
>
> Yes, bq27xxx_read() will return -EBUSY, and bq27xxx_read() is called from many functions.
>
> In our product, several different applications may access the gauge BQ27Z561-R2, and we often see -ENODEV returned as the error code.
> After debugging with an oscillograph and adding some debug output, we found that the device is sometimes busy and recovers very quickly (within a few milliseconds).
> So we want to exclude the busy case before returning -ENODEV.
>
> Best Regards,
> Jerry
>
> On Friday 13 September 2024 16:45:37 Jerry Lv wrote:
> > Multiple applications may access the device gauge at the same time, so the
> > gauge may be busy and EBUSY will be returned. The driver will set a flag to
> > record the EBUSY state, and this flag will be kept until the next periodic
> > update. When this flag is set, bq27xxx_battery_get_property() will just
> > return ENODEV until the flag is updated.
>
> I did not find any evidence of EBUSY. Which function returns it, and to
> which caller? Do you mean that bq27xxx_read() returns -EBUSY?
>
> > Even if the gauge was busy during the last accessing attempt, returning
> > ENODEV is not ideal, and can cause confusion in the applications layer.
>
> It would be better to either propagate the correct error or return the
> old value from the cache...
>
> > Instead, retrying the gauge access to update the properties is expected.
> > The gauge typically recovers from busy state within a few milliseconds, and
> > the cached flag will not cause issues while updating the properties.
> >
> > Signed-off-by: Jerry Lv <Jerry.Lv@xxxxxxxx>
> > ---
> > drivers/power/supply/bq27xxx_battery.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/power/supply/bq27xxx_battery.c b/drivers/power/supply/bq27xxx_battery.c
> > index 750fda543308..eefbb5029a3b 100644
> > --- a/drivers/power/supply/bq27xxx_battery.c
> > +++ b/drivers/power/supply/bq27xxx_battery.c
> > @@ -2029,7 +2029,7 @@ static int bq27xxx_battery_get_property(struct power_supply *psy,
> > bq27xxx_battery_update_unlocked(di);
> > mutex_unlock(&di->lock);
> >
> > - if (psp != POWER_SUPPLY_PROP_PRESENT && di->cache.flags < 0)
> > + if (psp != POWER_SUPPLY_PROP_PRESENT && di->cache.flags < 0 && di->cache.flags != -EBUSY)
> > return -ENODEV;
>
> ... but ignoring the error and re-using the error return value as flags
> in code later in this function is a bad idea.
>
> >
> > switch (psp) {
> >
> > ---
> > base-commit: da3ea35007d0af457a0afc87e84fddaebc4e0b63
> > change-id: 20240913-foo-fix2-a0d79db86a0b
> >
> > Best regards,
> > --
> > Jerry Lv <Jerry.Lv@xxxxxxxx>
> >
>