Re: [PATCH v1 5/5] driver core: Set fw_devlink=on by default

From: Marc Zyngier
Date: Mon Jan 18 2021 - 14:37:27 EST

On 2021-01-18 19:16, Geert Uytterhoeven wrote:
Hi Marc,

On Mon, Jan 18, 2021 at 6:59 PM Marc Zyngier <maz@xxxxxxxxxx> wrote:
On 2021-01-18 17:39, Geert Uytterhoeven wrote:
> On Fri, Dec 18, 2020 at 4:34 AM Saravana Kannan <saravanak@xxxxxxxxxx> wrote:
>> Cyclic dependencies in some firmware were one of the last remaining
>> reasons fw_devlink=on couldn't be set by default. Now that cyclic
>> dependencies don't block probing, set fw_devlink=on by default.
>> Setting fw_devlink=on by default brings a bunch of benefits (currently,
>> only for systems with device tree firmware):
>> * Significantly cuts down deferred probes.
>> * Device probe is effectively attempted in graph order.
>> * Makes it much easier to load drivers as modules without having to
>> worry about functional dependencies between modules (depmod is still
>> needed for symbol dependencies).
>> If this patch prevents some devices from probing, it's very likely due
>> to the system having one or more device drivers that "probe"/set up a
>> device (DT node with compatible property) without creating a struct
>> device for it. If we hit such cases, the device drivers need to be
>> fixed so that they populate struct devices and probe them like normal
>> device drivers so that the driver core is aware of the devices and their
>> status. See [1] for an example of such a case.
>> [1] -
>> Signed-off-by: Saravana Kannan <saravanak@xxxxxxxxxx>
> Shimoda-san reported that next-20210111 and later fail to boot
> on Renesas R-Car Gen3 platforms. No output is seen, unless earlycon
> is enabled.
> I have bisected this to commit e590474768f1cc04 ("driver core: Set
> fw_devlink=on by default").

There is a tentative patch from Saravana here[1], which works around
some issues on my RK3399 platform, and it'd be interesting to find
out whether that helps on your system.

Thanks for the suggestion, but given no devices probe (incl. GPIO
providers), I'm afraid it won't help. [testing] Indeed.

With the debug prints in device_links_check_suppliers enabled, and
some postprocessing, I get:

255 supplier e6180000.system-controller not ready
9 supplier fe990000.iommu not ready
9 supplier fe980000.iommu not ready
6 supplier febd0000.iommu not ready
6 supplier ec670000.iommu not ready
3 supplier febe0000.iommu not ready
3 supplier e7740000.iommu not ready
3 supplier e6740000.iommu not ready
3 supplier e65ee000.usb-phy not ready
3 supplier e6570000.iommu not ready
3 supplier e6054000.gpio not ready
3 supplier e6053000.gpio not ready
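
For reference, the postprocessing step above can be approximated with a short script. This is only a sketch: it assumes each debug line contains the text "supplier <name> not ready", and the consumer names and timestamps in the sample lines are made up for illustration (they are not taken from the actual boot log).

```python
import re
from collections import Counter

# Assumed message shape: "... supplier <name> not ready"
SUPPLIER_RE = re.compile(r"supplier (\S+) not ready")


def count_unready_suppliers(log_lines):
    """Return (count, supplier) pairs, most frequent first, ties sorted by name."""
    counts = Counter()
    for line in log_lines:
        match = SUPPLIER_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return sorted(((n, name) for name, n in counts.items()),
                  key=lambda pair: (-pair[0], pair[1]))


# Hypothetical sample input; only the supplier names come from the list above.
sample = [
    "[    0.5] platform consumer-a: probe deferral - supplier e6180000.system-controller not ready",
    "[    0.6] platform consumer-b: probe deferral - supplier fe990000.iommu not ready",
    "[    0.7] platform consumer-c: probe deferral - supplier e6180000.system-controller not ready",
    "[    0.8] unrelated message",
]

for count, supplier in count_unready_suppliers(sample):
    print(f"{count} supplier {supplier} not ready")
```

Feeding it the real dmesg output (for example via `open("dmesg.txt")`) instead of `sample` should give a summary in the same shape as the list above.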

As everything is part of a PM Domain, the (lack of the) system controller
must be the culprit. What's wrong with it? It is registered very early in
the boot:

[ 0.142096] rcar_sysc_pd_init:442: of_genpd_add_provider_onecell() returned 0

Yeah, this looks like the exact same problem. The devlink stuff assumes
that because there is a "compatible" property, there will be a driver
directly associated with the node containing this property.

If any other node has a reference to that first node, the dependency
will only get resolved if/when that first node is bound to a driver.
Trouble is, there is *tons* of code in the tree that invalidates
this heuristic, and each occurrence of it gives us another failure.

The patch I referred to papers over it by registering a dummy driver,
but that doesn't scale easily...
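
To illustrate the failure mode, here is a toy model of the heuristic described above. It is not kernel code: the `Node` class, `probe_all()` and the "hypothetical-gpio-user" node are invented for this sketch, and the only assumption carried over from the discussion is that a consumer waits for every supplier that has a "compatible" property until that supplier is bound to a driver.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    suppliers: list = field(default_factory=list)  # names of nodes this node references
    has_compatible: bool = True   # node carries a "compatible" property
    has_driver: bool = True       # False: set up by early init code, never bound to a driver


def probe_all(nodes):
    """Toy probe loop: return (bound, stuck) sets of node names."""
    by_name = {n.name: n for n in nodes}
    bound = set()
    progress = True
    while progress:
        progress = False
        for node in nodes:
            if node.name in bound or not node.has_driver:
                continue
            # The heuristic: any supplier with "compatible" is expected to
            # bind to a driver, so the consumer waits until it has.
            waiting = [s for s in node.suppliers
                       if by_name[s].has_compatible and s not in bound]
            if not waiting:
                bound.add(node.name)
                progress = True
    stuck = {n.name for n in nodes if n.has_driver and n.name not in bound}
    return bound, stuck


def build(sysc_has_driver):
    sysc = Node("e6180000.system-controller", has_driver=sysc_has_driver)
    gpio = Node("e6053000.gpio", suppliers=[sysc.name])
    user = Node("hypothetical-gpio-user", suppliers=[gpio.name])
    return [sysc, gpio, user]


# System controller initialised early, no driver ever binds to its node.
bound, stuck = probe_all(build(sysc_has_driver=False))
print("stuck:", sorted(stuck))

# Same tree with a dummy driver bound to the system controller node.
bound_fixed, stuck_fixed = probe_all(build(sysc_has_driver=True))
print("bound:", sorted(bound_fixed))
```

In the first run nothing downstream of the system controller ever binds, which matches the "supplier ... not ready" pattern above. In the second run the dummy driver unblocks the chain, but every such driverless node would need its own dummy, which is the scaling problem.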

Jazz is not dead. It just smells funny...