Re: [PATCH 00/15] Adding GAUDI NIC code to habanalabs driver

From: Oded Gabbay
Date: Thu Sep 10 2020 - 17:18:05 EST


On Fri, Sep 11, 2020 at 12:05 AM Florian Fainelli <f.fainelli@xxxxxxxxx> wrote:
>
>
>
> On 9/10/2020 1:32 PM, Oded Gabbay wrote:
> > On Thu, Sep 10, 2020 at 11:28 PM Jakub Kicinski <kuba@xxxxxxxxxx> wrote:
> >>
> >> On Thu, 10 Sep 2020 23:16:22 +0300 Oded Gabbay wrote:
> >>> On Thu, Sep 10, 2020 at 11:01 PM Jakub Kicinski <kuba@xxxxxxxxxx> wrote:
> >>>> On Thu, 10 Sep 2020 19:11:11 +0300 Oded Gabbay wrote:
> >>>>> create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_nic.c
> >>>>> create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_nic.h
> >>>>> create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_nic_dcbnl.c
> >>>>> create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_nic_debugfs.c
> >>>>> create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_nic_ethtool.c
> >>>>> create mode 100644 drivers/misc/habanalabs/gaudi/gaudi_phy.c
> >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qm0_masks.h
> >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qm0_regs.h
> >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qm1_regs.h
> >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qpc0_masks.h
> >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qpc0_regs.h
> >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_qpc1_regs.h
> >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_rxb_regs.h
> >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_rxe0_masks.h
> >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_rxe0_regs.h
> >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_rxe1_regs.h
> >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_stat_regs.h
> >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_tmr_regs.h
> >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txe0_masks.h
> >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txe0_regs.h
> >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txe1_regs.h
> >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txs0_masks.h
> >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txs0_regs.h
> >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic0_txs1_regs.h
> >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic1_qm0_regs.h
> >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic1_qm1_regs.h
> >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic2_qm0_regs.h
> >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic2_qm1_regs.h
> >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic3_qm0_regs.h
> >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic3_qm1_regs.h
> >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic4_qm0_regs.h
> >>>>> create mode 100644 drivers/misc/habanalabs/include/gaudi/asic_reg/nic4_qm1_regs.h
> >>>>> create mode 100644 drivers/misc/habanalabs/include/hw_ip/nic/nic_general.h
> >>>>
> >>>> The relevant code needs to live under drivers/net/(ethernet/).
> >>>> For one thing our automation won't trigger for drivers in random
> >>>> (/misc) part of the tree.
> >>>
> >>> Can you please elaborate on how to do this with a single driver that
> >>> is already in misc ?
> >>> As I mentioned in the cover letter, we are not developing a
> >>> stand-alone NIC. We have a deep-learning accelerator with a NIC
> >>> interface.
> >>> Therefore, we don't have a separate PCI physical function for the NIC
> >>> and I can't have a second driver registering to it.
> >>
> >> Is it not possible to move the files and still build them into a single
> >> module?
> > hmm...
> > I actually didn't try that as I thought it will be very strange and
> > I'm not familiar with other drivers that build as a single ko but have
> > files spread out in different subsystems.
> > I don't feel it is a better option than what we did here.
> >
> > Will I need to split pull requests to different subsystem maintainers
> > ? For the same driver ?
> > Sounds to me this is not going to fly.
>
> Not necessarily, you can post your patches to all relevant lists and
> seek maintainer review/acked-by tags from the relevant maintainers. This
> is not unheard of with mlx5 for instance.
Yeah, I see what you are saying. The problem is that sometimes,
because everything is tightly integrated in our SOC, a single patch
contains common code (common to ALL our ASICs, even those that
don't have a NIC at all), GAUDI-specific code which is not NIC-related,
and the NIC code itself.
But I guess that as a last resort if this is a *must* I can do that.
Though I would like to hear Greg's opinion on this as he is my current
maintainer.

Personally I do want to send relevant patches to netdev because I want
to get your expert reviews on them, but still keep the code in a
single location.

>
> Have you considered using notifiers to get your NIC driver registered
> while the NIC code lives in a different module?
Yes, and I preferred to keep it simple. I didn't want to start sending
notifications to the NIC driver every time, for example, I needed to
reset the SOC because a compute engine got stuck. Or vice versa, to
start sending notifications to the common driver when some error
happened in the NIC.

In addition, from my AMD days, we had a very tough time managing two
drivers that "talk" to each other and manage the same H/W. I'm talking
about amdgpu for graphics and amdkfd for compute (of which I was the
maintainer). AMD has been working over the past years to unite those two
drivers to get out of that mess. That's why I didn't want to go down
that road.

Thanks,
Oded

> --
> Florian