Re: BUG in mmc: core: Disable card detect during shutdown
From: Ulf Hansson
Date: Tue Jun 07 2022 - 07:08:29 EST
On Sat, 4 Jun 2022 at 12:16, H. Nikolaus Schaller <hns@xxxxxxxxxxxxx> wrote:
>
> Hi,
>
> > Am 03.06.2022 um 12:46 schrieb Ulf Hansson <ulf.hansson@xxxxxxxxxx>:
> >
> > On Mon, 30 May 2022 at 18:55, H. Nikolaus Schaller <hns@xxxxxxxxxxxxx> wrote:
> >>
> >> Hi Ulf,
> >> users reported a strange issue: the OMAP5-based Pyra does not
> >> shut down when kernel 5.10.116 is used.
> >>
>
> ...
>
> >> mmc_stop_host() is not called but __mmc_stop_host() is called 4 times.
> >> There are 4 active MMC interfaces in the Pyra - 3 for (µ)SD slots
> >> and one for an SDIO WLAN module.
> >>
> >> Now it looks as if 3 of them are properly torn down (two of them
> >> seem to have host->slot.cd_irq >= 0) but on the fourth call
> >> cancel_delayed_work_sync(&host->detect); does not return. This is
> >> likely where it stalls, which is why we don't see "reboot: Power down".
> >>
> >> Any ideas?
> >
> > I guess the call to cancel_delayed_work_sync() in __mmc_stop_host()
> > hangs for one of the mmc hosts. This shouldn't happen - and indicates
> > that something else is wrong.
>
> Yes, you were right...
>
> >
> > See more suggestions below.
> >
> >>
> >> BR and thanks,
> >> Nikolaus
> >>
> >> printk hack:
> >>
> >> void __mmc_stop_host(struct mmc_host *host)
> >> {
> >> printk("%s 1\n", __func__);
> >> if (host->slot.cd_irq >= 0) {
> >> printk("%s 2\n", __func__);
> >> mmc_gpio_set_cd_wake(host, false);
> >> printk("%s 3\n", __func__);
> >> disable_irq(host->slot.cd_irq);
> >> printk("%s 4\n", __func__);
> >> }
> >>
> >> host->rescan_disable = 1;
> >> printk("%s 5\n", __func__);
> >
> > My guess is that it's the same mmc host that causes the hang. I
> > suggest you print the name of the host too, to verify that. Something
> > along the lines of the below.
> >
> > printk("%s: %s 5\n", mmc_hostname(host), __func__);
>
> To my surprise, this did report an mmc6 host port where the OMAP5 only has 4...
>
> Yes, we have a special driver for the txs02612 sdio switch and voltage translator
> chip to make two ports out of the single mmc2 port of the OMAP5 SoC.
>
> This driver was begun ca. 7 years ago but never finished...
>
> The idea is to make a mmc port have several subports. For the Pyra handheld hardware
> we needed 5 mmc/sdio interfaces but the omap5 only has 4 of them available to us.
>
> So the txs02612 driver is sitting between the omap5 mmc2 host pins and switches
> between a µSD slot and an eMMC.
>
> Therefore, the driver is a mmc client driver (like e.g. the driver of some WiFi chip
> connected to some SDIO port) and provides multiple mmc host interfaces.
>
> It should intercept data transfer requests to its multiple mmc hosts, synchronize
> (or enqueue) them, control the switch gpio and forward requests to the processor's
> mmc host port so that they are processed (after switching).
>
> We never continued to make this work...

Well, I can imagine that it's just very difficult to make this work properly.
Moreover, the mmc core and its block layer code aren't designed to
support this type of configuration. For example, the I/O scheduling
can't work with this setup.

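
To illustrate the multiplexing idea described above, here is a minimal
userspace sketch. It is not the txs02612 driver and uses no kernel API;
every name in it (mux_model, mux_submit, mux_run, the subport enum) is
hypothetical. It only models one shared host with a two-way switch and
counts how often the switch has to be flipped before a request can be
forwarded.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical model: one physical host shared by two sub-ports. */
enum subport { SUBPORT_USD = 0, SUBPORT_EMMC = 1 };

struct mux_model {
	enum subport selected;	/* sub-port the switch currently routes to */
	unsigned int switches;	/* how often the switch had to be flipped */
	unsigned int forwarded;	/* requests passed on to the shared host */
};

void mux_init(struct mux_model *m)
{
	m->selected = SUBPORT_USD;
	m->switches = 0;
	m->forwarded = 0;
}

/* Forward one request; flip the modelled switch first if needed. */
void mux_submit(struct mux_model *m, enum subport port)
{
	if (m->selected != port) {
		m->selected = port;	/* stands in for toggling the select GPIO */
		m->switches++;
	}
	m->forwarded++;		/* stands in for issuing the request upstream */
}

/* Submit n requests in order and return how many flips they caused. */
unsigned int mux_run(struct mux_model *m, const enum subport *ports,
		     unsigned int n)
{
	unsigned int before = m->switches;
	unsigned int i;

	for (i = 0; i < n; i++)
		mux_submit(m, ports[i]);

	return m->switches - before;
}
```

In this model, four requests that alternate between the two sub-ports
cost three flips, while the same four requests grouped per sub-port cost
one. A real driver would also have to re-select and possibly
re-initialize the card on every flip, which the sketch leaves out.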
>
> What remained is simple code to manually throw the switch through some /sysfs
> control file after doing an eject and before a fresh partprobe.
>
> Still, the probe function of the txs02612 driver does two calls to mmc_add_host().
> These seem to make
>
> >
> >> cancel_delayed_work_sync(&host->detect);
>
> get stuck. Most likely because the initialization is not complete for handling
> card detection.
>
> >>
> >> --- here should be another __mmc_stop_host 6
> >> --- and reboot: Power down
> >
> > When/if you figured out that it's the same host that hangs, you could
> > try to disable that host through the DTS files (add status =
> > "disabled" in the device node, for example) - and see if that works.
>
> When not calling mmc_add_host() in our txs02612 driver fragment we can
> properly shut down the OMAP5. That is the solution with the least effort.
> The other would be to make the txs02612 driver work properly...
>
> So in summary there is no bug upstream. It is in our tree.
Thanks for sharing the details.
>
> If you are interested in what our code fragment for the txs02612 looks like:
>
> https://git.goldelico.com/?p=letux-kernel.git;a=shortlog;h=refs/heads/letux/txs02612
>
> Maybe you have some suggestions to make it work?

Sorry, but I have lots of things to do at this point. Maybe some other time.

Kind regards
Uffe