Re: [PATCH net-next 5/7] net: marvell: prestera: add LAG support

From: Jakub Kicinski
Date: Tue Feb 09 2021 - 13:49:23 EST


On Tue, 09 Feb 2021 12:56:55 +0100 Tobias Waldekranz wrote:
> > I ask myself that question pretty much every day. Sadly I have no clear
> > answer.
>
> Thank you for your candid answer, really appreciate it. I do not envy
> you one bit, making those decisions must be extremely hard.
>
> > Silicon is cheap, you can embed a reasonable ARM or Risc-V core in the
> > chip for the area and power draw comparable to one high speed serdes
> > lane.
> >
> > The drivers landing in the kernel are increasingly meaningless. My day
> > job is working for a hyperscaler. Even though we have one of the most
> > capable kernel teams on the planet, most of the issues with HW we face
> > result in "something is wrong with the FW, let's call the vendor".
>
> Right, and being a hyperscaler probably at least gets you some attention
> when you call your vendor. My day job is working for a nanoscaler, so my
> experience is that we must be prepared to solve all issues in-house; if
> we get any help from the vendor that is just a bonus.
>
> > And even when I say "drivers landing" it is an overstatement.
> > If you look at high speed anything these days the drivers cover
> > multiple generations of hardware; it seems like ~5 years ago most
> > NIC vendors reached sufficient FW saturation to cover up differences
> > between HW generations.
> >
> > At the same time some FW is necessary. Certain chip functions are
> > best driven by a micro-controller running a tight control loop.
>
> I agree. But I still do not understand why vendors cling to the source
> of these like it was their wallet. That is the beauty of selling
> silicon; you can fully leverage OSS and still have a very
> straightforward business model.

Vendors want to be able to "add value", lock users in and sell support.
Users adding features themselves hurts their bottom line. Take a look
at the income breakdown for publicly traded companies. There were also
rumors recently about a certain huge silicon vendor revoking the SDK
license from a NOS company after that company got bought.

Business people make rational choices, trust me. It's on us to make
rational choices in the interest of the community (incl. our users).

> > The complexity of FW is a spectrum, from basic to Qualcomm.
> > The problem is there is no way for us to know what FW is hiding
> > by just looking at the driver.
> >
> > Where do we draw the line?
>
> Yeah it is a very hard problem. In this particular case though, the
> vendor explicitly said that what they have done is compiled their
> existing SDK to run on the ASIC:
>
> https://lore.kernel.org/netdev/BN6PR18MB1587EB225C6B80BF35A44EBFBA5A0@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>
> So there is no reason that it could not be done as a proper driver.

I guess you meant "no _technical_ reason" ;)

> > Personally I'd really like to see us pushing back stronger.
>
> Hear, hear!