Re: [PATCH v3 00/14] Adding GAUDI NIC code to habanalabs driver
From: Oded Gabbay
Date: Fri Sep 18 2020 - 07:56:41 EST
On Fri, Sep 18, 2020 at 2:52 PM Leon Romanovsky <leon@xxxxxxxxxx> wrote:
>
> On Fri, Sep 18, 2020 at 02:36:10PM +0300, Gal Pressman wrote:
> > On 17/09/2020 20:18, Jason Gunthorpe wrote:
> > > On Tue, Sep 15, 2020 at 11:46:58PM +0300, Oded Gabbay wrote:
> > >> infrastructure for communication between multiple accelerators. Same
> > >> as Nvidia uses NVlink, we use RDMA that we have inside our ASIC.
> > >> The RDMA implementation we did does NOT support some basic RDMA
> > >> IBverbs (such as MR and PD) and therefore, we can't use the rdma-core
> > >> library or to connect to the rdma infrastructure in the kernel.
> > >
> > > You can't create a parallel RDMA subsystem in netdev, or in misc, and
> > > you can't add random device offloads as IOCTL to nedevs.
> > >
> > > RDMA is the proper home for all the networking offloads that don't fit
> > > into netdev.
> > >
> > > EFA was able to fit into rdma-core/etc and it isn't even RoCE at
> > > all. I'm sure this can too.
> >
> > Well, EFA wasn't welcomed to the RDMA subsystem with open arms ;), initially it
> > was suggested to go through the vfio subsystem instead.
> >
> > I think this comes back to the discussion we had when EFA was upstreamed, which
> > is what's the bar to get accepted to the RDMA subsystem.
> > IIRC, what we eventually agreed on is having a userspace rdma-core provider and
> > ibv_{ud,rc}_pingpong working (or just supporting one of the IB spec's QP types?).
> >
> > Does GAUDI fit these requirements? If not, should it be in a different subsystem
> > or should we open the "what qualifies as an RDMA device" question again?
>
> I want to remind you that rdma-core requirement came to make sure that
> anything exposed from the RDMA to the userspace is strict with proper
> UAPI header hygiene.
>
> I doubt that Havana's ioctls are backed by anything like this.
>
> Thanks
Why do you doubt that ? Have you looked at our code ?
Our uapi and IOCTLs interface is based on drm subsystem uapi interface
and it is very safe and protected.
Otherwise Greg would have never allowed me to go upstream in the first place.
We have a single function which is the entry point for all the IOCTLs
of our drivers (only one IOCTL is RDMA related, all the others are
compute related).
That function is almost 1:1 copy of the function in drm.
Thanks,
Oded