Re: fwnode_for_each_child_node() and OF backend discrepancy
From: Michael Walle
Date: Tue Jun 28 2022 - 10:22:29 EST
Am 2022-06-28 15:51, schrieb Krzysztof Kozlowski:
On 28/06/2022 15:47, Michael Walle wrote:
[adding Horatiu Vultur, because we now digress to the bug
in the switch, rather than that odd OF behavior]
Am 2022-06-28 15:29, schrieb Andy Shevchenko:
On Tue, Jun 28, 2022 at 3:23 PM Michael Walle <michael@xxxxxxxx>
wrote:
I was trying to fix the lan966x driver [1] which doesn't work if
there
are disabled nodes in between.
Can you elaborate what's wrong now in the behaviour of the driver?
In
the code it uses twice the _available variant.
Imagine the following device tree snippet:
port0 {
reg = <0>;
status = "okay";
}
port1 {
reg = <1>;
status = "disabled";
}
port@2 {
reg = <2>;
status = "okay";
}
The driver will set num_phys_ports to 2. When port@2 is probed, it
will have the (correct!) physical port number 2. That will then
trigger various EINVAL checks with "port_num >= num_phys_ports" or
WARN()s.
It means the above mentioned condition is wrong: it should be
"port_idx >= num_phys_ports" (if the port_idx doesn't exists, that's
the bug in the first place)
I can't follow you here. Please note, that you need the actual
physical port number. It's not a made up number, but corresponds
to a physical port on that ethernet switch. So you can't just skip
the disabled ones. port@2 must have port number 2.
The number "2" you get from the reg property, so where is exactly the
problem?
That you need to get the total number of ports of the hardware (which
is also used for things beyond validation, eg. during switch init
all ports will are disabled). And right now, that is done by counting
all the nodes - which is bad, I guess we agree here. But it works,
as long as no ports are disabled and all ports are described in the
device tree. But I have device trees where some are disabled.
I assume, you cannot read the hardware itself to get the number of
physical ports; and we have the compatible "microchip,lan966x-switch",
which is the generic one, so it could be the LAN9668 (with 8 ports)
or the LAN9662 (with 2 ports). We somehow have to retain backwards
compatibility. Thus my idea was to at least make the handling slightly
better and count *any* child nodes. So it doesn't fall apart with
disabled
nodes. Then introduce proper compatible strings
"microchip,lan9668-switch"
and use that to hardcode the num_phys_ports to 8. But there will be
device trees with microchip,lan966x-switch out there, which we do want
to support.
I see the following options:
(1) just don't care and get rid of the "microchip,lan966x-switch"
compatible
(2) quick fix for the older kernels by counting all the nodes and
proper fix for the newer kernels with dedicated compatibles
(3) no fix for older kernels, introduce new compatibles for new
kernels
(4) keep generic compatible if the hardware can be read out to get
the number of ports.
I think (1) isn't the way to go. (2) was my try until I noticed
that odd fwnode behavior. But judging by this thread, I don't think
thats possible. I don't know if (4) is possible at all. If not we'd
be left with (3).
If you want to validate it against some maximum number of ports, based
on DTS, it makes no sense... One can supply a DTS with port number
10000
and 10000 nodes, or with specific property "maximum-port-number=10000".
It would make sense if you validate it against driver-hard-coded values
(which you later mentioned in your reply).
Yes, see above.
-michael