Re: [PATCHv1 00/19] Improve SBS battery support

From: Sebastian Reichel
Date: Mon Jun 01 2020 - 13:05:35 EST


Hi Marek,

On Mon, Jun 01, 2020 at 12:40:27PM +0200, Marek Szyprowski wrote:
> On 13.05.2020 20:55, Sebastian Reichel wrote:
> > This patchset improves support for SBS compliant batteries. Due to
> > the changes, the battery now exposes 32 power supply properties and
> > (un)plugging it generates a backtrace containing the following message
> > without the first patch in this series:
> >
> > ---------------------------
> > WARNING: CPU: 0 PID: 20 at lib/kobject_uevent.c:659 add_uevent_var+0xd4/0x104
> > add_uevent_var: too many keys
> > ---------------------------
> >
> > For references this is what an SBS battery status looks like after
> > the patch series has been applied:
> >
> > cat /sys/class/power_supply/sbs-0-000b/uevent
> > POWER_SUPPLY_NAME=sbs-0-000b
> > POWER_SUPPLY_TYPE=Battery
> > POWER_SUPPLY_STATUS=Discharging
> > POWER_SUPPLY_CAPACITY_LEVEL=Normal
> > POWER_SUPPLY_HEALTH=Good
> > POWER_SUPPLY_PRESENT=1
> > POWER_SUPPLY_TECHNOLOGY=Li-ion
> > POWER_SUPPLY_CYCLE_COUNT=12
> > POWER_SUPPLY_VOLTAGE_NOW=11441000
> > POWER_SUPPLY_CURRENT_NOW=-26000
> > POWER_SUPPLY_CURRENT_AVG=-24000
> > POWER_SUPPLY_CAPACITY=76
> > POWER_SUPPLY_CAPACITY_ERROR_MARGIN=1
> > POWER_SUPPLY_TEMP=198
> > POWER_SUPPLY_TIME_TO_EMPTY_AVG=438600
> > POWER_SUPPLY_TIME_TO_FULL_AVG=3932100
> > POWER_SUPPLY_SERIAL_NUMBER=0000
> > POWER_SUPPLY_VOLTAGE_MIN_DESIGN=10800000
> > POWER_SUPPLY_VOLTAGE_MAX_DESIGN=10800000
> > POWER_SUPPLY_ENERGY_NOW=31090000
> > POWER_SUPPLY_ENERGY_FULL=42450000
> > POWER_SUPPLY_ENERGY_FULL_DESIGN=41040000
> > POWER_SUPPLY_CHARGE_NOW=2924000
> > POWER_SUPPLY_CHARGE_FULL=3898000
> > POWER_SUPPLY_CHARGE_FULL_DESIGN=3800000
> > POWER_SUPPLY_CONSTANT_CHARGE_CURRENT_MAX=3000000
> > POWER_SUPPLY_CONSTANT_CHARGE_VOLTAGE_MAX=12300000
> > POWER_SUPPLY_MANUFACTURE_YEAR=2017
> > POWER_SUPPLY_MANUFACTURE_MONTH=7
> > POWER_SUPPLY_MANUFACTURE_DAY=3
> > POWER_SUPPLY_MANUFACTURER=UR18650A
> > POWER_SUPPLY_MODEL_NAME=GEHC
>
> This patch landed in linux-next dated 20200529. Sadly it causes a
> regression on Samsung Exynos-based Chromebooks (Exynos5250 Snow,
> Exynos5420 Peach-Pi and Exynos5800 Peach-Pit). System boots to
> userspace, but then, when udev populates /dev, booting hangs:
>
> [    4.435167] VFS: Mounted root (ext4 filesystem) readonly on device
> 179:51.
> [    4.457477] devtmpfs: mounted
> [    4.460235] Freeing unused kernel memory: 1024K
> [    4.464022] Run /sbin/init as init process
> INIT: version 2.88 booting
> [info] Using makefile-style concurrent boot in runlevel S.
> [    5.102096] random: crng init done
> [....] Starting the hotplug events dispatcher: systemd-udevdstarting
> version 236
> [ ok .
> [....] Synthesizing the initial hotplug events...[ ok done.
> [....] Waiting for /dev to be fully populated...[   34.409914]
> TPS65090_RAILSDCDC1: disabling
> [   34.412977] TPS65090_RAILSDCDC2: disabling
> [   34.417021] TPS65090_RAILSDCDC3: disabling
> [   34.423848] TPS65090_RAILSLDO1: disabling
> [   34.429068] TPS65090_RAILSLDO2: disabling

:(

log does not look useful either.

> Bisect between v5.7-rc1 and next-20200529 pointed me to the first bad
> commit: [c4b12a2f3f3de670f6be5e96092a2cab0b877f1a] power: supply:
> sbs-battery: simplify read_read_string_data.

ok. I tested this on an to-be-upstreamed i.MX6 based system
and arch/arm/boot/dts/imx53-ppd.dts. I think the difference
is, that i2c-exynos5 does not expose I2C_FUNC_SMBUS_READ_BLOCK_DATA.
I hoped all systems using SBS battery support this, but now
I see I2C_FUNC_SMBUS_EMUL only supports writing block data.
Looks like I need to add another patch implementing that
using the old code with added PEC support.

In any case that should only return -ENODEV for the property
(and uevent), but not break boot. So something fishy is going
on.

> However reverting it in linux-next doesn't fix the issue, so the
> next commits are also relevant to this issue.

The next patch, which adds PEC support depends on the simplification
of sbs_read_string_data. The old, open coded variant will result in
PEC failure for string properties (which should not stop boot either
of course). Can you try reverting both?

If that helps I will revert those two instead of dropping the whole
series for this merge window.

> Let me know how can I help debugging it.

I suspect, that this is userspace endlessly retrying reading the
battery uevent when an error is returned. Could you check this?
Should be easy to see by adding some printfs.

That would mean a faulty battery could stall complete boot without
a useful error message, which is bad and needs to be fixed.

Sorry for the inconvience and thanks for your report,

-- Sebastian

Attachment: signature.asc
Description: PGP signature