Re: fwnode_for_each_child_node() and OF backend discrepancy

From: Michael Walle
Date: Tue Jun 28 2022 - 11:09:29 EST


Am 2022-06-28 16:36, schrieb Krzysztof Kozlowski:
On 28/06/2022 16:22, Michael Walle wrote:
I can't follow you here. Please note, that you need the actual
physical port number. It's not a made up number, but corresponds
to a physical port on that ethernet switch. So you can't just skip
the disabled ones. port@2 must have port number 2.

The number "2" you get from the reg property, so where is exactly the
problem?

That you need to get the total number of ports of the hardware (which
is also used for things beyond validation, eg. during switch init
all ports will are disabled). And right now, that is done by counting
all the nodes - which is bad, I guess we agree here.

It's bad also from another reason - the DT node was explicitly disabled,
but you perform some operation on actual hardware representing this
node. I would assume that a disabled DT node means it is not
operational, e.g. hardware not present or missing clocks, so you should
not treat it as another meaning - power down/unused.

Mh. Assume a SoC with an integrated ethernet switch. Some ports
are externally connected, some don't. I'd think they should be disabled,
no? Until now, all bindings I know, treat them as disabled. But OTOH
you still need to do some configurations on them, like disable port
forwarding, disable them or whatever. So the hardware is present, but
it is not connected to anything.

But it works,
as long as no ports are disabled and all ports are described in the
device tree. But I have device trees where some are disabled.

I am not sure if I follow this. You have devices which
1. have unused ports, but all DT nodes are available/okay,
2. have unused ports, which are in DT status=disabled?

Doesn't case 2 break the bindings? If so, we don't care about such
out-of-tree users. We cannot support all of possible weird combinations
in out-of-tree DTS files...

Case 1 is invalid I think.

How does case 2 break the binding? It breaks the driver, yes. But not
the binding. I agree on the out-of-tree argument, *but* isn't that
what the binding is for, that out-of-tree device trees gonna work as
long as they follow the binding? And I don't see where it dictates that all
nodes must be enabled; nor that it must either be 2 or 8 children nodes.

I assume, you cannot read the hardware itself to get the number of
physical ports; and we have the compatible "microchip,lan966x-switch",
which is the generic one, so it could be the LAN9668 (with 8 ports)
or the LAN9662 (with 2 ports).

I'll keep that argument for future when I see again patches adding such
wildcard compatible. :) I had to discuss with some folks...

Although the compatible difference does not have to be important here,
because one could say the 9662 and 9668 programming model is exaclty the
same and they differ by number of ports. This number of ports can be a
dedicated property or counted from the children (if they were all
available).

Mh. Rob was always in favor of dedicated compatible strings. And I
think there are also subtle differences. Eg. the LAN9662 has some kind
of accelerating engine, if I'm not mistaken.

So what do you prefer:

compatible = "microchip,lan9668";
and
compatible = "microchip,lan9662";

or

compatible = "microchip,lan966x";
microchip,num-phys-ports = <8>;
and
compatible = "microchip,lan966x";
microchip,num-phys-ports = <2>;
microchip,accelerating-engine;
..

The argument here was always, we don't want too much properties if
it can be deduced by the compatible string.

We somehow have to retain backwards
compatibility. Thus my idea was to at least make the handling slightly
better and count *any* child nodes. So it doesn't fall apart with
disabled
nodes. Then introduce proper compatible strings
"microchip,lan9668-switch"
and use that to hardcode the num_phys_ports to 8. But there will be
device trees with microchip,lan966x-switch out there, which we do want
to support.

I see the following options:
(1) just don't care and get rid of the "microchip,lan966x-switch"
compatible
(2) quick fix for the older kernels by counting all the nodes and
proper fix for the newer kernels with dedicated compatibles
(3) no fix for older kernels, introduce new compatibles for new
kernels

I propose this one. I would not care about out-of-tree DTSes which
decided to disable random stuff and expect things working. :)

I'd argue, that is the usual case for all the switch bindings I
know of; not some unusual config. E.g. the SoC dtsi disables all
ports by default and only the ones which are actually connected
by the board are then enabled in the board dts, see
arch/arm/boot/dts/lan966x.dtsi
arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi

That being said, I don't care too much about the older kernels.
So I'm fine with (3).

-michael


(4) keep generic compatible if the hardware can be read out to get
the number of ports.

I think (1) isn't the way to go. (2) was my try until I noticed
that odd fwnode behavior. But judging by this thread, I don't think
thats possible. I don't know if (4) is possible at all. If not we'd
be left with (3).