Re: [Bisected Regression] OLPC XO-1.5: Internal drive and SD card (mmcblk*) gone since commit ea718c699055

From: Rob Herring
Date: Thu Sep 09 2021 - 19:10:01 EST


On Thu, Sep 9, 2021 at 2:24 PM Saravana Kannan <saravanak@xxxxxxxxxx> wrote:
>
> On Thu, Sep 9, 2021 at 8:15 AM Rob Herring <robh+dt@xxxxxxxxxx> wrote:
> >
> > On Thu, Sep 9, 2021 at 9:09 AM Andre Muller <andre.muller@xxxxxx> wrote:
> > >
> > > On 09/09/2021 00.31, Rob Herring wrote:
> > > > On Tue, Sep 7, 2021 at 10:15 PM Saravana Kannan <saravanak@xxxxxxxxxx> wrote:
> > > >>
> > > >> On Tue, Sep 7, 2021 at 7:12 PM Andre Muller <andre.muller@xxxxxx> wrote:
> > > >>>
> > > >>> On 08/09/2021 00.05, Saravana Kannan wrote:
> > > >>>> On Sun, Sep 5, 2021 at 1:15 AM Andre Muller <andre.muller@xxxxxx> wrote:
> > > >>>>>
> > > >>>>> With linux-5.13 and linux-5.14, the internal drive and SD card reader are gone from the XO-1.5. I bisected the issue to come up with ea718c699055:
> > > >>>>>
> > > >>>>> # first bad commit: [ea718c699055c8566eb64432388a04974c43b2ea] Revert "Revert "driver core: Set fw_devlink=on by default""
> > > >>>>>
> > > >>>>> The /dev/mmcblk* nodes are not generated since this patch.
> > > >>>>>
> > > >>>>> Please find the output of lspsi -vv and lshw below.
> > > >>>>>
> > > >>>>> I will be happy to provide more info and/or test patches.
> > > >>>>
> > > >>>> Hi Andre,
> > > >>>>
> > > >>>> Can you point me to the dts file in upstream that corresponds to this system?
> > > >>>>
> > > >>>> Also, if you can give the output of:
> > > >>>> cat /sys/kernel/debug/devices_deferred
> > > >>>
> > > >>> Hi Saravana,
> > > >>>
> > > >>>
> > > >>> /sys/kernel/debug/devices_deferred is empty.
> > > >>> I used the last good commit b6f617.
> > > >>
> > > >> Sorry, I wanted that with the bad commit.
> > >
> > > Uh-oh, my bad...
> > >
> > > The bad case says
> > > # cat devices_deferred
> > > 0000:00:0c.0
> > >
> > > That's the SD Host controller.
> > >
> > > >>
> > > >>>
> > > >>> The XO-1.5 has an x86 compatible VIA C7 processor.
> > > >>> It uses the VX855 chip for about all I/O tasks, including SDIO.
> > > >>> I am not aware of a device tree file for it.
> > > >>>
> > > >>> It is a bit of a strange beast, it uses OFW to initialize the hardware and provide a FORTH shell.
> > > >>> Which also is the boot manager, configured via FORTH scripts.
> > > >>>
> > > >>> From the linux side of the fence, dmesg's line 2 is:
> > > >>>
> > > >>> "OFW detected in memory, cif @ 0xff83ae68 (reserving top 8MB)"
> > > >>>
> > > >>> AIUI, this mechanism is used in lieu of a device tree file, like UEFI on most x86 hardware.
> > > >>> But my understanding of device trees is severely limited, I might be allwrong.
> > > >>
> > > >> Uhh... I'm so confused. If Linux doesn't use OF, then none of the code
> > > >> enabled by fw_devlink=on should be executed.
> > > >
> > > > Linux does, but maybe not for memory (like UEFI on arm64).
> > > >
> > > >> The only thing that might remotely even execute is:
> > > >> efifb_add_links() in drivers/firmware/efi/efi-init.c
> > > >>
> > > >> If you want you can just do an early return 0; in that to see if it
> > > >> makes a difference (unlikely).
> > > >>
> > > >> Rob, Do you know what's going on with OLPC and DT?
> > > >
> > > > Not really. I have an XO-1 DT dump[1]. It's probably a similar looking
> > > > DT though. It's pretty ancient lacking anything we've invented for DT
> > > > in the last 10 years. There's not really much to it as about the only
> > > > phandle I see is for interrupts.
> > > >
> > > >>> Anyway, the firmware source is here:
> > > >>> http://dev.laptop.org/git/users/quozl/openfirmware/
> > > >>>
> > > >>> This file is the closest dt-analogous thing for the XO-1.5 I can find therein:
> > > >>> cpu/x86/pc/olpc/via/devices.fth
> > > >>
> > > >> That file is all gibberish to me.
> > > >
> > > > Running this on a booted system would help:
> > > >
> > > > dtc -f -I fs -O dts /proc/device-tree > dump.dts
> > >
> > > Ah, thanks. I never knew about the DT in there...
> > > XO-1.5_dump.dts is attached.
> > >
> > > >
> > > > If you don't have dtc on the system, then you'll have to zip up
> > > > /proc/device-tree contents and run dtc elsewhere (or just post that).
> > > >
> > > >>> My machine runs the latest version:
> > > >>> http://wiki.laptop.org/go/OLPC_Firmware_q3c17
> > > >>>
> > > >>> The XO-1.5 hardware specs are here:
> > > >>> http://wiki.laptop.org/images/f/f0/CL1B_Hdwe_Design_Spec.pdf
> > > >>> http://wiki.laptop.org/go/Hardware_specification_1.5
> > > >>>
> > > >>> Would the .config or dmesg help?
> > > >>
> > > >> At this point, why not? When you do send them, please send them as
> > > >> attachments and not inline.
> > > >>
> > > >> Also, when you collect the dmesg logs, the following could help:
> > > >> Enable the existing dev_dbg logs in these functions:
> > > >> device_link_add()
> > > >> device_links_check_suppliers()
> > > >>
> > > >> And add the following log to fwnode_link_add():
> > > >> +++ b/drivers/base/core.c
> > > >> @@ -87,6 +87,8 @@ int fwnode_link_add(struct fwnode_handle *con,
> > > >> struct fwnode_handle *sup)
> > > >> goto out;
> > > >> }
> > > >>
> > > >> + pr_info("Link fwnode %pfwP as a consumer of fwnode %pfwP\n", con, sup);
> > > >> +
> > > >
> > >
> > > OK. The dmesg with debug info is attached as well (for the broken case).
> >
> > Humm, ACPI and DT together...
> >
> > Looks to me like it's waiting for the wrong interrupt-parent. The log
> > says it is waiting for 'interrupt-controller@i20' which is the only
> > interrupt-controller found in the DT, but the parent is the PCI bridge
> > with whatever interrupt-map is pointing to. That's not clear as the
> > phandle (0x767a4) doesn't exist in the DT. I suppose the parent is
> > defined in ACPI?
>
> After staring at it for a while, I realized that
> interrupt-controller@i20 is indeed the right node. Looks like we need
> to do endian conversion of the ".node" property in the interrupt
> controller and it would match with 0x767a4.

Ah yes, I failed on doing the endian conversion.

> > pci 0000:00:0c.0: probe deferral - wait for supplier interrupt-controller@i20
> The SD controller is waiting forever on interrupt-controller@i20 to be
> added as a device.
>
> Rob,
>
> My guess is that the fwnode value is not getting set for ISA devices
> populated when isa@11 is added. Any idea how/where those child devices
> are populated? I thought they'd be platform devices, but it doesn't
> look like that's the case?

Sometimes they are. Sometimes there's no driver I think. I couldn't
figure out what code corresponds to this node in the kernel.

> > If there's not an easy fix, just disable devlinks for x86. There's
> > only one other DT platform, ce4100, and I really doubt it is even used
> > at all.
>
> I think the easy fix is to set the ISA device's fwnode when it's
> added, but I can't tell how they are getting added. But yeah, if that
> turns out to be hard, then I'd vote for disabling it for x86 too.

Just disable it.

Rob