Re: [net-next: PATCH v3 6/8] net: core: switch to fwnode_find_net_device_by_node()
From: Vladimir Oltean
Date: Thu Jul 28 2022 - 15:56:22 EST
On Thu, Jul 28, 2022 at 08:47:58AM +0200, Marcin Wojtas wrote:
> > The 'label' property of a port was optional, you've made it mandatory by accident.
> > It is used only by DSA drivers that register using platform data.
>
> Thanks for spotting that, I will make it optional again.
>
> > (side note, I can't believe you actually have a 'label' property for the
> > CPU port and how many people are in the same situation as you; you know
> > it isn't used for anything, right? how do we stop the cargo cult?)
>
> Well, given these results:
> ~/git/linux : git grep 'label = \"cpu\"' arch/arm/boot/dts/ | wc -l
> 79
> ~/git/linux : git grep 'label = \"cpu\"' arch/arm64/boot/dts/ | wc -l
> 14
>
> It's not a surprise I also have it defined in the platforms I test. I
> agree it doesn't serve any purpose in terms of creating the devices in
> DSA with DT, but it IMO increases readability of the description at
> least.
We've glided over this way too easily, so I'll repeat this thing I've said:
| One can have udev rules that assign names to Ethernet ports. I think
| that is even encouraged; some of the things in DSA predate the
| establishment of some best practices.
I know I'm not exactly "upfront" by saying this at v3 rather than earlier,
but I haven't had the time and I still don't have as much as I'd like.
Sorry for that.
Please don't jump to sending v4 just yet, and please don't expect that
this patch set will make it for the upcoming 5.20 release candidates.
ACPI is a whole new world and I don't think we want to mass-migrate each
and every OF binding that DSA has to the generic fwnode form, at least
not without having a serious discussion about it.
The 'label' thing is actually one of the things that I'm seriously
considering skipping parsing if this is an ACPI system, simply because
best practices are different today than they were when the OF bindings
were created.
There's also the change that validates that phylink has the fwnode
properties it needs to work properly:
https://patchwork.kernel.org/project/netdevbpf/patch/20220723164635.1621911-1-vladimir.oltean@xxxxxxx/
Please don't even think that the DSA fwnode conversion will be merged
before the validation patch (sorry, I'm not saying this to block you,
I'm saying this because I don't want DSA to start with zero-day baggage
on ACPI).
And even when the validation patch gets merged, you'll need to adapt it
to fwnode because that's what will be required syntactically, but we'll
only go through the motions of calling of_device_compatible_match() for
the OF case. With ACPI, every driver will opt into strict validation,
that's non negotiable.
And then there are some other issues we've learned about, with the DT
bindings that specific drivers such as mv88e6xxx and realtek-smi have.
I'll give you more details once we get to the actual mv88e6xxx
conversion to ACPI; currently my memory lacks some of the precise
details of how come mv88e6xxx came to not observe the issue but
realtek-smi did. Anyway, the issue was that fw_devlink causes the
internal PHYs to probe with the generic rather than the specific PHY
driver, if interrupts are being used (and provided by the switch as
'interrupt-controller').
It can be debated what exactly is at fault there, although one
interpretation can be that the DT bindings themselves are to blame,
for describing a circular dependency between a parent and a child.
I've been suggesting the authors of new drivers to take an alternative
approach and describe the switch chip as a MFD, with only the actual
switching component being probed by DSA and the rest being separate
drivers:
https://lore.kernel.org/all/20211204022037.dkipkk42qet4u7go@skbuf/T/
You'll say, ok but don't we have to keep maintaining mv8e6xxx OF bindings
functional with fw_devlink too anyway, so what benefit would there be if
for ACPI we'd split the exact same monolithic driver into a MFD?
And maybe you have a point, I don't know, I haven't actually tried to
look at the code and see if it could be restructured cleanly to probe
and work in both cases.
The bottom line is that if you haven't received too much review for your
series until now, I suspect it's because none of the DSA maintainers
cropped a large enough chunk of time yet to actually clarify things for
themselves. I don't consider the things I've pointed out here to be a
'review' in the proper sense, they're just cases I'm thinking about in
the back of my mind where we should learn from past mistakes.
I'll revisit when I will have come to some conclusions.