Re: [PATCH V4 0/5] mlx5 ConnectX control misc driver

From: Jason Gunthorpe
Date: Thu Apr 04 2024 - 13:48:05 EST


On Thu, Apr 04, 2024 at 07:48:50AM -0700, Jakub Kicinski wrote:
> On Thu, 4 Apr 2024 09:23:38 -0300 Jason Gunthorpe wrote:
> > > "didn't understand the discussion" is an ironic thing for you to +1,
> > > David. After all my emails about HNS3 RDMA you somehow concluded today
> > > that I want to make rules for the entire kernel:
> > > https://lore.kernel.org/all/6faa47b0-27c3-47f9-94be-1ec671d9543c@xxxxxxxxxx/
> >
> > What if (hypothetically) I told you that the congestion control
> > settings in the device FW impacted netdev sourced ethernet traffic as
> > well? Would you be so sanguine that RDMA should have those settings?
>
> We can lawyer the words until the cows come home.
> The team I work on takes care of both RoCE/IB/pick your fav proto
> and TCP/IP NICs. It's fairly obvious what is RoCE and what is TCP
> or user UDP when there are no incentives to act otherwise :|

Sure you can tell the difference, that isn't my hypothetical. I'm
asking what if the people who designed the device choose not to tell
the difference?

> > > And I second what Ed said. I have asked multiple vendors preaching
> > > impossibilism in this thread to start posting those knobs. I offered
> > > to do a quick off-list review of the list of knobs they have to give
> > > a quick yay / nay, so they don't waste time implementing things that
> > > would get nacked. None of the vendors bothered taking me up on that
> > > offer.
> >
> > As far as configuration/provisioning goes, it is really all or
> > nothing.
> >
> > If a specific site can configure only 90% of the stuff required
> > because you will NAK the missing 10% then it is still not usable
> > and is a wasted effort for everyone.
>
> (a) are you saying that the device needs 100% of the knobs to be used?
> oof, you better warn your prospective customers :S

No, I'm saying I have 100 customers, 600 configurables and every
customer needs a partially overlapping set of 30 of them to be
different than the COTS manufacturing default.

I can implement 30 and support one customer, but I can't support all
100 customers without all 600 knobs.

> (b) as Ed pointed out some of the "knobs" are just hacks and lazy
> workarounds so we rejected them for quality reasons; the remaining
> rejects are because the knobs aren't really device specific, but
> vendors don't want to extend existing APIs, as it is easier to
> ship "features" without having a core kernel dependency...

Which is back to my point. You are picking and choosing what gets to
be supported, and the end result is none of the 100 customers get to
actually work.

It is overreach to demand that the devices be re-designed as a
condition to be part of mainline. The configurables exist as they are
and need to be supported, in one way or another, by the kernel.

> > You have never shown that there is a path to 100% with your approach
> > to devlink. In fact I believe you've said flat out that 100% is not
> > achievable. Right here you illustrate the fundamental problem again:
> > there are configurables that already exist in the device that you will
> > NAK for devlink.
> >
> > This is fundamentally why no one is taking you up on these generous
> > offers to pre-NAK device's designs. You made it explicit that you
> > will NAK something and then it is not 100%.
> >
> > Saeed has said repeatedly he wants 100% of the endless configurables
> > in mlx5. You have the manual and know what they are, tell him how to
> > get to 100% in a few months of work and I will believe you that it is
> > not impossible.
>
> Sorry, are you saying that I'm responsible for providing a solution
> to allow arbitrary vendor tools to work and proprietary user space to
> communicate directly with the proprietary firmware?

I am responding to your remark about "vendors preaching
impossibilism". If you want me to agree with you that it is possible
then yes, you need to show a way where we get to a point that users
are actually able to solve their problems.

Otherwise all I hear is how you are going to NAK some unknowable
subset of the needed configurables. Sure sounds impossible to me.

Jason