Re: [PATCH 1/1] driver core: Fix unbalance probe_count in really_probe()

From: Geert Uytterhoeven
Date: Wed Jun 03 2020 - 04:30:58 EST


Hi Ji-Ze,

On Wed, Jun 3, 2020 at 9:35 AM Ji-Ze Hong (Peter Hong) <hpeter@xxxxxxxxx> wrote:
> Geert Uytterhoeven wrote on 2020/6/3 03:13:
> > If devres_head is not empty, you have a serious problem on your system,
> > as those resources may be in an unknown state (e.g. freed but still in
> > use). While I had missed the probe_count imbalance when implementing
> > the original change, it may actually be safer to not decrease
> > probe_count, to prevent further probes from happening. But I guess it
> > doesn't matter: if you get here, your system is in a bad state anyway.
>
> We want to fix the shutdown/reboot freeze issue. We bisected it to this
> patch and found that if probe_count != 0, the PC gets stuck in
> wait_for_device_probe() forever on shutdown/reboot. So we just moved
> the increment to after the return -EBUSY.

IC. And before my change, you got a big fat warning backtrace, telling you
something is seriously wrong? ;-)
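
To make the imbalance concrete, here is a minimal userspace sketch of it. It
is not the kernel code: struct model_device, has_stale_devres,
probe_unbalanced(), probe_balanced() and shutdown_would_block() are
hypothetical names for illustration, and a plain int stands in for the
atomic probe_count. It assumes that wait_for_device_probe() sleeps until
probe_count reaches zero, as described above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Userspace model only: stands in for the driver core's atomic probe_count. */
static int probe_count;

/* Hypothetical stand-in for struct device; has_stale_devres models a
 * non-empty dev->devres_head at probe time. */
struct model_device {
	bool has_stale_devres;
};

/* Early return after the increment: the decrement is never reached. */
static int probe_unbalanced(const struct model_device *dev)
{
	probe_count++;
	if (dev->has_stale_devres)
		return -EBUSY;	/* leaves probe_count one too high */

	/* ... the driver's probe callback would run here ... */
	probe_count--;
	return 0;
}

/* Check first, count afterwards: every increment has a matching decrement. */
static int probe_balanced(const struct model_device *dev)
{
	if (dev->has_stale_devres)
		return -EBUSY;

	probe_count++;
	/* ... the driver's probe callback would run here ... */
	probe_count--;
	return 0;
}

/* Models the wait condition: a waiter sleeps while probe_count != 0. */
static bool shutdown_would_block(void)
{
	assert(probe_count >= 0);
	return probe_count != 0;
}
```

In this model, a single probe_unbalanced() call on a device with
has_stale_devres set leaves shutdown_would_block() returning true forever,
while probe_balanced() returns the same -EBUSY without touching the counter.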

> In this case, it may be a resource conflict between 8250_PNP and the
> serial 8250 platform driver. I'll try to dump more messages to debug.

OK.

> IMO, the shutdown/reboot operation should not block.

Well, it depends. If there's an issue with resources, the system may crash,
too.

> >> with serial8250 platform driver. e.g. AOPEN DE6200. The conflict boot
> >> dmesg below:
> >>
> >> Serial: 8250/16550 driver, 32 ports, IRQ sharing enabled
> >> 00:03: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 921600) is a 16550A
> >> 00:04: ttyS1 at I/O 0x2f8 (irq = 3, base_baud = 921600) is a 16550A
> >> 00:05: ttyS2 at I/O 0x3e8 (irq = 5, base_baud = 921600) is a 16550A
> >> serial8250: ttyS3 at I/O 0x2e8 (irq = 3, base_baud = 921600) is a 16550A
> >>
> >> Reboot/Shutdown will freeze in wait_for_device_probe(), with the
> >> following message:
> >> INFO: task systemd-shutdown:1 blocked for more than 120 seconds.
> >
> > Now, how did you get to this state, i.e. which driver triggered the
> > "Resources present before probing" message? Because that is the root
> > issue that must be fixed, and the probe_count imbalance is IMHO just a
> > red herring.
> >
>
> Sorry, I left out an important dmesg line:
>
> Serial: 8250/16550 driver, 32 ports, IRQ sharing enabled
> 00:03: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 921600) is a 16550A
> 00:04: ttyS1 at I/O 0x2f8 (irq = 3, base_baud = 921600) is a 16550A
> 00:05: ttyS2 at I/O 0x3e8 (irq = 5, base_baud = 921600) is a 16550A
> serial8250: ttyS3 at I/O 0x2e8 (irq = 3, base_baud = 921600) is a 16550A
> platform serial8250: Resources present before probing
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

OK. So the serial8250 driver does something fishy.

When the warning triggered for me, it was due to a driver calling a devm_*()
function on a different device than the one being probed, cfr.
https://lore.kernel.org/r/alpine.DEB.2.21.1911201053330.25420@xxxxxxxxxxxxxx
which was fixed by commit 32085f25d7b68404 ("mdio_bus: don't use managed
reset-controller").

The serial8250 driver, or the subdriver for an SoC-specific variant, may
do something similar.
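
As a rough illustration of that failure mode, here is a userspace model, not
the real devres implementation. All names in it (struct model_device,
devres_count, model_devm_get(), probe_bus_buggy(), probe_bus_ok(),
resources_present_before_probing()) are hypothetical, and a counter stands
in for the entries on dev->devres_head. The assumption is only that a
managed resource is attached to whichever device is passed to the devm_*()
call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device. */
struct model_device {
	const char *name;
	int devres_count;	/* models entries on dev->devres_head */
};

/* Models a devm_*() call: the resource is tied to 'owner'. */
static void model_devm_get(struct model_device *owner)
{
	assert(owner != NULL);
	owner->devres_count++;
}

/* Models the check done before a device's probe callback is run. */
static bool resources_present_before_probing(const struct model_device *dev)
{
	return dev->devres_count != 0;
}

/* Buggy pattern: while 'bus' is being probed, a managed resource is
 * attached to 'child', which has not been probed yet. */
static void probe_bus_buggy(struct model_device *bus,
			    struct model_device *child)
{
	(void)bus;
	model_devm_get(child);
}

/* Safer pattern in this model: managed resources are only attached to the
 * device that is currently being probed. */
static void probe_bus_ok(struct model_device *bus,
			 struct model_device *child)
{
	(void)child;
	model_devm_get(bus);
}
```

After probe_bus_buggy(), the child already owns a managed resource, so a
later probe of the child would trip the "Resources present before probing"
check. With probe_bus_ok() the child's list stays empty.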

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds