Re: [PATCH V4 0/5] mlx5 ConnectX control misc driver

From: Jason Gunthorpe
Date: Fri Mar 22 2024 - 21:33:53 EST


On Fri, Mar 22, 2024 at 03:29:24PM -0700, Jakub Kicinski wrote:
> On Fri, 22 Mar 2024 18:44:23 -0300 Jason Gunthorpe wrote:
> > On Fri, Mar 22, 2024 at 01:58:26PM -0700, Jakub Kicinski wrote:
> > > > Well said, David.
> > > >
> > > > I would totally support doing something like this in a fairly generic
> > > > way that could be leveraged/instantiated by drivers that will allow
> > > > communication/inspection of hardware blocks in the datapath. There are
> > > > lots of different ways this could go, so feedback on this would help get
> > > > us all moving in the right direction.
> > >
> > > The more I learn, the more I am convinced that the technical
> > > justifications here are just smoke and mirrors.
> >
> > Let's see some evidence of this then, point to some silicon devices
> > in the multibillion gate space that don't have complex FW built into
> > their design?
>
> Existence of complex FW does not imply that production systems must
> have a backdoor to talk to that FW in kernel-unmitigated fashion.

I think we've been over this endlessly: production systems require a
solid debugging story for all software in the system, including the
device FW.

> As an existence proof I give you NICs we use at Meta.
> Or old Netronome NICs, you can pick.

I wouldn't pick those; they don't meet the multi-billion gate
criterion, i.e. expensive chips built on the most cutting-edge and
expensive process. Smaller chips on non-cutting-edge processes have
different economics. Startups have different economics.

Look at something like an Intel/AMD GPU or Habana Labs device. Lack of
FW creates a very opaque driver that is twiddling megabytes of
register mnemonics that is incomprehensible without HW
documentation. A win for code availability, but it hasn't actually
crossed into being usefully open.

> > Despite all of those having built devices like this well before the
> > "AI gold rush" and it being a general overall design principle for the
> > industry because, yes, the silicon technology available actually
> > demands it.
> >
> > It is not to say you couldn't do otherwise, it is just simply too
> > expensive.
>
> I do agree that it is expensive, not sure if it's "too" expensive.
> But Linux never promised that our way of doing SW development would
> always be the most cost effective option, right? Especially short
> term. Or that we'll be competitive time to market.

By "too expensive" I mean the vendor cannot produce a chip at a price
that is salable. Or a startup goes out of business because it can't
afford to respin.

Linux promises community collaboration where the community members
broadly should be driving the priorities. When time to market (TTM)
matters and there is agreement on that, it gets done. See for instance
the various massive security fixes.

> > > RDMA is what it is but I really hate how you're trying to pretend
> > > that it is somehow an inherent need of advanced technology and
> > > we need to lower the openness standards for all of the kernel.
> >
> > Open hardware has never been an "openness standard" for the kernel.
>
> I was in the meeting with a vendor this morning and when explicitly
> asked by an SRE (not from my org nor in any way "primed" by me)
> whether configuration of some run of the mill PCI thing can be
> exposed

"run of the mill PCI thing"? Does this thing already have a devlink
knob? Usually "run of the mill PCI things" are configured through the
PCI subsystem, not devlink.

> via devlink params instead of whatever proprietary thing the vendor was
> pitching, the vendor's answer was silence and then a pitch of another
> proprietary mechanism.

Our team was very excited about devlink when it first came about. But
we now have so many devlink parameters that have been kept out of
mainline that I see the excitement has died.

> So no, the "open hardware" is certainly not a requirement for the
> kernel. But users can't get vendors to implement standard Linux
> configuration interfaces, and your proposal will make it a lot worse.

I don't agree. If you can't get your vendor to implement the thing you
want on devlink right now today, this fwctl isn't going to change that
one bit. You already have the vendor tool and the vendor telling you
to use it. It makes no difference at all how the existing vendor tools
reach the device.

Indeed, counting on lockdown to break all the existing vendor tools
and render them permanently unusable seems to me to be straining one
of the few hard full project rules Linux actually has: don't break
existing userspace.

Think bigger, maybe your SRE will be happier if as part of this we can
get the vendors to agree on some common userspace tooling for device
configuration! Wouldn't that be a great big ecosystem improvement!

Frankly, there are many forms of common interfaces and many paths to
get there. I don't view your very restrictive approach as helpful
to growing this space. It hasn't delivered a working ecosystem, and I
don't think it ever will. Instead, as this thread is showing, we have
a bunch of unhappy community members and poor upstream support for
devices.

Jason