Re: TI: X15 the connected SSD is not detected on Linux next 20221006 tag
From: Serge Semin
Date: Mon Oct 17 2022 - 11:53:00 EST
On Mon, Oct 17, 2022 at 09:43:24AM +0200, Anders Roxell wrote:
> On Fri, 14 Oct 2022 at 16:06, Serge Semin
> <Sergey.Semin@xxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > On Fri, Oct 14, 2022 at 11:22:38AM +0200, Anders Roxell wrote:
> > > On Fri, 14 Oct 2022 at 09:53, Damien Le Moal
> > > <damien.lemoal@xxxxxxxxxxxxxxxxxx> wrote:
> > > >
> > > > On 10/14/22 16:31, Arnd Bergmann wrote:
> > > > > On Fri, Oct 14, 2022, at 2:22 AM, Damien Le Moal wrote:
> > > > >> On 10/14/22 07:07, Anders Roxell wrote:
> > > > >> [...]
> > > > >>>> 8)
> > > > >>>>> If reverting these patches restores the eSATA port on this board, then you need
> > > > >>>>> to fix the defconfig for that board.
> > > > >>>>
> > > > >>>> OTOH,
> > > > >>>> Anders, enabled the new config CONFIG_AHCI_DWC=y and tried but the
> > > > >>>> device failed to boot.
> > > > >>>
> > > > >>> I thought it would work with enabling CONFIG_AHCI_DWC=y, but it didn't...
> > > > >>
> > > > >> As mentioned in my previous reply to Naresh, this is a new driver added in
> > > > >> 6.1. Your board was working before so this should not be the driver needed
> > > > >> for it.
> > > > >>
> > > > >>> However, reverting patch 33629d35090f ("ata: ahci: Add DWC AHCI SATA
> > > > >>> controller support")
> > > > >>> from next-20221013 was a success, kernel booted and the 'mkfs.ext4' cmd was
> > > > >>> successful.
> > > > >>
> > > > >> Which is very strange... There is only one hunk in that commit that could
> > > > >> be considered suspicious:
> > > > >>
> > > > >> diff --git a/drivers/ata/ahci_platform.c b/drivers/ata/ahci_platform.c
> > > > >> index 9b56490ecbc3..8f5572a9f8f1 100644
> > > > >> --- a/drivers/ata/ahci_platform.c
> > > > >> +++ b/drivers/ata/ahci_platform.c
> > > > >> @@ -80,9 +80,7 @@ static SIMPLE_DEV_PM_OPS(ahci_pm_ops, ahci_platform_suspend,
> > > > >> static const struct of_device_id ahci_of_match[] = {
> > > > >> { .compatible = "generic-ahci", },
> > > > >> /* Keep the following compatibles for device tree compatibility */
> > > > >> - { .compatible = "snps,spear-ahci", },
> > > > >> { .compatible = "ibm,476gtr-ahci", },
> > > > >> - { .compatible = "snps,dwc-ahci", },
> > > > >> { .compatible = "hisilicon,hisi-ahci", },
> > > > >> { .compatible = "cavium,octeon-7130-ahci", },
> > > > >> { /* sentinel */ }
> > > > >>
> > > > > >> Is your board using one of these compatible strings?
> > > > >
> > > > > The x15 uses "snps,dwc-ahci". I would expect it to detect the device
> > > > > with the new driver if that is loaded, but it's possible that the
> > > > > driver does not work on all versions of the dwc-ahci hardware.
> > > > >
> > > > > Anders, can you provide the boot log from a boot with the new driver
> > > > > built in? There should be some messages from dwc-ahci about finding
> > > > > the device, but then not ultimately working.
> > > > >
> > > > > Depending on which way it goes wrong, the safest fallback for 6.1 is
> > > > > probably to move the "snps,spear-ahci" and "snps,dwc-ahci" compatible
> > > > > strings back into the old driver, and leave the new one only for
> > > > > the "baikal,bt1-ahci" implementation of it, until it has been
> > > > > successfully verified on TI am5/dra7, spear13xx and exynos.
> > > >
> > > > OK. So a fix patch until further tests/debug is completed would be this:
> > > >
> > > > diff --git a/drivers/ata/ahci_dwc.c b/drivers/ata/ahci_dwc.c
> > > > index 8fb66860db31..7a0cbab00843 100644
> > > > --- a/drivers/ata/ahci_dwc.c
> > > > +++ b/drivers/ata/ahci_dwc.c
> > > > @@ -469,8 +469,6 @@ static struct ahci_dwc_plat_data ahci_bt1_plat = {
> > > > };
> > > >
> > > > static const struct of_device_id ahci_dwc_of_match[] = {
> > > > - { .compatible = "snps,dwc-ahci", &ahci_dwc_plat },
> > > > - { .compatible = "snps,spear-ahci", &ahci_dwc_plat },
> > > > { .compatible = "baikal,bt1-ahci", &ahci_bt1_plat },
> > > > {},
> > > > };
> > > > diff --git a/drivers/ata/ahci_platform.c b/drivers/ata/ahci_platform.c
> > > > index 8f5572a9f8f1..9b56490ecbc3 100644
> > > > --- a/drivers/ata/ahci_platform.c
> > > > +++ b/drivers/ata/ahci_platform.c
> > > > @@ -80,7 +80,9 @@ static SIMPLE_DEV_PM_OPS(ahci_pm_ops, ahci_platform_suspend,
> > > > static const struct of_device_id ahci_of_match[] = {
> > > > { .compatible = "generic-ahci", },
> > > > /* Keep the following compatibles for device tree compatibility */
> > > > + { .compatible = "snps,spear-ahci", },
> > > > { .compatible = "ibm,476gtr-ahci", },
> > > > + { .compatible = "snps,dwc-ahci", },
> > > > { .compatible = "hisilicon,hisi-ahci", },
> > > > { .compatible = "cavium,octeon-7130-ahci", },
> > > > { /* sentinel */ }
> > > >
> > > > Anders, Naresh,
> > > >
> > > > Can you try this ?
> > >
> >
> > > Tested this patch on today's linux-next tag: next-20221014 without enabling
> > > CONFIG_AHCI_DWC and it worked as expected when booting [1].
> > > On the other hand I also tried a build/boot with CONFIG_AHCI_DWC enabled
> > > and it worked as expected to boot [2].
> >
> > Expected result. The DWC driver will probe the device only on our
> > platform, while your platform falls back to using the generic driver.
> > Anders, in order to understand the root cause of the problem could you please
> > 1. upload the bogus boot log.
>
> This [1] is the bogus boot log.
>
> > 2. try what I suggested here
> > Link: https://lore.kernel.org/linux-ide/20221014133623.l6w4o7onoyhv2q34@mobilestation/
> > and if the system fails to boot at some point upload the boot log.
>
> Only doing this:
>
> --- a/drivers/ata/ahci_dwc.c
> +++ b/drivers/ata/ahci_dwc.c
> @@ -316,12 +316,13 @@ static int ahci_dwc_init_host(struct ahci_host_priv *hpriv)
> if (rc)
> goto err_disable_resources;
> }
> -
> +/*
> ahci_dwc_check_cap(hpriv);
>
> ahci_dwc_init_timer(hpriv);
>
> rc = ahci_dwc_init_dmacr(hpriv);
> +*/
> if (rc)
> goto err_clear_platform;
>
> and enabling CONFIG_AHCI_DWC made mkfs detect the SATA drive [2].

Judging by what is in [1] and [2], I very much doubt that [1] was
run with CONFIG_AHCI_DWC enabled: the boot log contains neither an
ahci-dwc driver probe failure nor any of the log messages seen in [2]
(see every line with the word ahci-dwc in it).

1. If the device probe procedure had failed at some point, you
would have got a line like this:
< ahci-dwc: probe of 4a140000.sata failed with error -errno
But there is no such line in [1]. There is literally nothing about
AHCI/SATA/SCSI/DWC AHCI/ahci-dwc/etc in it.

2. If the DW AHCI device probe had at least been attempted, the
following call chain would have been executed:
ahci_dwc_probe()
+-> ahci_dwc_get_resources()
    +-> ahci_platform_get_resources()
        +-> ...
        +-> devm_regulator_get(...)
        +-> ...
which would have produced the following log messages:
< [] ahci-dwc 4a140000.sata: supply ahci not found, using dummy regulator
< [] ahci-dwc 4a140000.sata: supply phy not found, using dummy regulator
< [] ahci-dwc 4a140000.sata: supply target not found, using dummy regulator
You do have these lines in [2], but they are missing in [1]. Had any
error been detected in ahci_dwc_probe() before that point, you would
have got an error printed as I noted in 1.

3. Had the problem been in the commented-out code lines, you would
at least have got the messages above printed to the log [1], because
the commented-out code is executed after the resource request
procedure (ahci_dwc_init_host() is called after
ahci_dwc_get_resources()).

4. Finally, the commented-out code doesn't really do anything that
could cause the device probe to halt silently.
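
To illustrate points 2 and 3, here is a rough user-space model of the
ordering argument. It is only a sketch, not the driver code: the
model_*() functions and struct probe_log are made-up names, and the
"probe failed with error" entry just stands in for the error line the
kernel would print. It shows that the regulator messages are logged
before the host init stage runs, so a failure there could not hide them.

```c
#include <string.h>

/* Hypothetical user-space model of the probe order; NOT the kernel code. */
#define LOG_CAP 8

struct probe_log {
	const char *lines[LOG_CAP];
	int count;
};

static void log_line(struct probe_log *log, const char *msg)
{
	if (log->count < LOG_CAP)
		log->lines[log->count++] = msg;
}

/* Stage 1: stands in for ahci_dwc_get_resources(). The regulator
 * lookups emit their "dummy regulator" messages at this point. */
static int model_get_resources(struct probe_log *log)
{
	log_line(log, "supply ahci not found, using dummy regulator");
	log_line(log, "supply phy not found, using dummy regulator");
	log_line(log, "supply target not found, using dummy regulator");
	return 0;
}

/* Stage 2: stands in for ahci_dwc_init_host(). init_rc is the
 * (assumed) result of the code that was commented out in the test. */
static int model_init_host(struct probe_log *log, int init_rc)
{
	if (init_rc)
		log_line(log, "probe failed with error");
	return init_rc;
}

/* Runs both stages in the same order as the real probe path. */
int model_probe(struct probe_log *log, int init_rc)
{
	int rc;

	log->count = 0;
	rc = model_get_resources(log);
	if (rc)
		return rc;
	return model_init_host(log, init_rc);
}
```

Calling model_probe(&log, -5) leaves the three regulator lines in the
log followed by the failure entry, which is the pattern I would expect
in [1] if the commented-out code were the culprit.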

All of that makes me think that the DW AHCI SATA device wasn't even
probed in [1], which most likely means that either the driver config
was omitted there or the device was disabled. So could you please
restart the system as in [2], but with the lines above uncommented?
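
For what it's worth, the check I did on the logs by eye can be sketched
as a small user-space helper. This is just an illustration under my own
assumptions: classify_boot_log() and the enum names are made up, and it
only looks for the "ahci-dwc" and "failed with error" substrings quoted
above, nothing more.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical classification of a boot log; names are illustrative. */
enum probe_state {
	DRIVER_NOT_SEEN,	/* no line mentions ahci-dwc at all */
	DRIVER_SEEN,		/* ahci-dwc lines present, no failure line */
	DRIVER_PROBE_FAILED	/* an ahci-dwc line reports a probe failure */
};

/* Scan the log lines and report whether the ahci-dwc driver left any
 * trace, and if so whether one of its lines reports a failed probe. */
enum probe_state classify_boot_log(const char *const *lines, size_t n)
{
	enum probe_state state = DRIVER_NOT_SEEN;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!strstr(lines[i], "ahci-dwc"))
			continue;
		if (strstr(lines[i], "failed with error"))
			return DRIVER_PROBE_FAILED;
		state = DRIVER_SEEN;
	}
	return state;
}
```

By that logic [1] falls into the DRIVER_NOT_SEEN case and [2] into
DRIVER_SEEN, which is why I suspect the driver never ran in [1].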

* Please make sure that Damien's fix
https://www.spinics.net/lists/arm-kernel/msg1017920.html
isn't applied to the kernel used in [2].
[1] https://lkft.validation.linaro.org/scheduler/job/5634743#L2580
[2] https://lkft.validation.linaro.org/scheduler/job/5679278#L2617
-Sergey
>
> Cheers,
> Anders
> [1] https://lkft.validation.linaro.org/scheduler/job/5634743#L2580
> [2] https://lkft.validation.linaro.org/scheduler/job/5679278#L2617