Re: On brcm80211 maintenance and support

From: Arend van Spriel
Date: Thu Oct 12 2023 - 04:43:08 EST


On 10/11/2023 1:28 PM, 'Hector Martin' via BRCM80211-DEV-LIST,PDL wrote:


On 2023/10/11 19:23, Kalle Valo wrote:
Neal Gompa <neal@xxxxxxxxx> writes:

On Sat, Oct 7, 2023 at 8:51 AM Hector Martin <marcan@xxxxxxxxx> wrote:


On 07/10/2023 00.48, Dmitry Antipov wrote:
On 10/6/23 18:34, Hector Martin wrote:

For better or worse, if nobody else does, I'm willing to sign up to
maintain the chips shipping on Apple ARM64 machines (i.e. BCM4378,
BCM4387, BCM4388 - that last one I have bringup for downstream, just got
it done this week) and partially BCM4377 as a bonus (since I have access
to an older Intel Mac with that one, and already did bringup for it,
though my access is sporadic). I'm already playing part time maintainer
anyway (other folks have already sent us patches I'll have to upstream),
and we need this driver to keep working and continue to support new chips.

Good news. Would you capable to consider some generic (not hooked to any
particular hardware) things like [1] ?

[1]
https://lore.kernel.org/linux-wireless/20230703162458.155942-1-dmantipov@xxxxxxxxx/


Sure, I've done cleanup type stuff myself too.


Can we please get this done so that the pile of Broadcom patches can
actually start landing again? It's been frustrating watching patch
submissions be ignored for over a year now. At least add Hector as a
co-maintainer and allow him to land stuff people have been using
outside to get Broadcom Wi-Fi to *work*.

Having stuff sit on the pile and be *ignored* is frustrating for
contributors and users, and massively disincentivizes people from
working in upstream Linux.

Your email reminds me of this comic:

https://xkcd.com/2347/

In the last few years we seem to be getting more of these "Work faster!"
emails and honestly it's getting frustrating for us maintainers. If
Linux wireless is important for you then help us! You can review
patches, run tests on real hardware, write hwsim test cases[1], fix
compiler warnings[2] etc. to help us maintainers and speed up
development. There's so much to do and while you gain experience on the
wireless development you can help even more.

Also take it into account that it's not just simple to "take patches"
and be done with it. There are high quality requirements, the code needs
to have no compiler warnings and must not cause any regressions in
existing setups. That's not easy at all, especially as our hardware
testing is basically limited to few the most active drivers. And let
alone there are very exotic hardware out there and it's impossible to
test all of them.

I think we need to qualify this "no regressions" "rule". People regress
our machines in mainline all the time. We catch it and get it fixed,
sometimes in RCs, sometimes it goes all the way to stable and needs to
get fixed there. We've had patches break everything from Bluetooth LE
pairing to core memory management to the point we needed earlycon to
diagnose it. That last one was a blatantly wrong patch to core Linux MM,
it wasn't even something specific to our platform, but even that got
past review (it just happened to break us specifically due to a
coincidence).

The review process doesn't magically avoid regressing things. "No
regressions" is impossible without someone actually testing things.
Claiming otherwise is dishonest. So if I end up as maintainer here I
certainly am not going to promise "no regressions" for chips I can't
test, without someone interested in those chips testing them. Of course
I'll take regression fixes when they do happen and someone notices, but
I can't know in advance until someone does.

I do not think Kalle was claiming that. Regression will happen as it is simply not possible to verify each patch on all devices supported by the driver.

Consider a patch that changes some codepath in the driver. I can't know
whether the original logic was always broken, or whether it worked on
some chips, and whether the new logic works on those chips or will
regress them, without testing. This is a regular occurrence with
brcmfmac, due to the complexity and variability of the firmware
interface. Often multiple versions of stuff are supported, or some
structures can be extended, but sometimes they can't. It's a mess, and
without firmware source code nor any official specs, there is no way to
know exactly what is intended and what the backwards compatibility
requirements are.

The only way to avoid that is to gratuitously introduce version/chip
gates for *every single change affecting behavior from the firmware
POV*, which is a complete non-starter and would quickly yield a giant
mess of spaghetti code. It's bad enough having to support explicit
ABI-breaking changes in firmware, and having to deal with multiple
versions of huge structures and convert between them. Trying to outright
keep existing logic identical for "other chips" is just not going to
happen, not without first having confirmation of what the requirements
are from someone who has the required docs/source.

I have a patch to enable WPA3 in Broadcom chipsets (yes, the driver is
in such a sorry state it doesn't even support that yet). The current
support attempt was added by a Cypress engineer and uses a completely
different firmware mechanism. Is that supposed to actually work? Does it
work currently? Is that the case for all Cypress firmwares? Or only
some? Does the alternate mechanism we have for Broadcom chips work too?
Only Cypress can answer those questions ahead of time, and they aren't
(they ignored me last time I brought this up). So my current patch just
replaces the mechanism with the known-working one for Broadcom chips.

This is mainly why I introduced the vendor-split concept so we can keep the Cypress mechanism and allow a different mechanism for Broadcom chips. The Cypress mechanism did not work for the Broadcom chips I have so I wanted to test it on the Cypress chips I got shipped long ago and they simply do not come up. Have not tried with RPi as it is not running vanilla kernel. Could try with a backports driver.

Next time I send a bunch of our downstream patches, I'm going to resend
that WPA3 patch. And if it regresses Cypress WPA3 support, tough luck.
If someone catches it (Phil?) and it turns out the existing support is
the only way to do things on Cypress, I'll rework the patch to
conditionally support both styles. But if nobody does, and nobody cares,
then I'm sorry if I regress things, but there is no way to avoid it. We
can't be gratuitously gating every single change just to hope to avoid
regressions, without any confirmation of what is required. That just
doesn't scale.

The "no regressions" rule is an important one throughout the kernel. Trying to avoid it makes sense, but I have to agree that for brcmfmac there is not enough coverage to give any guarantee. As long you are prepared to fix things when a regression report comes in that should be fine depending on how often there is a need to revert/fix things as it can disrupt the normal flow of patches.

Regards,
Arend

Attachment: smime.p7s
Description: S/MIME Cryptographic Signature