Re: [PATCH v3 00/14] Adding GAUDI NIC code to habanalabs driver

From: Oded Gabbay
Date: Sun Sep 20 2020 - 15:06:09 EST


On Sun, Sep 20, 2020 at 11:47 AM Greg Kroah-Hartman
<gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Sat, Sep 19, 2020 at 04:22:35PM -0300, Jason Gunthorpe wrote:
> > On Sat, Sep 19, 2020 at 07:27:30PM +0200, Greg Kroah-Hartman wrote:
> > > > It's probably heresy, but why do I need to integrate into the RDMA subsystem ?
> > > > I understand your reasoning about networking (Ethernet) as the driver
> > > > connects to the kernel networking stack (netdev), but with RDMA the
> > > > driver doesn't use or connect to anything in that stack. If I were to
> > > > support IBverbs and declare that I support it, then of course I would
> > > > need to integrate to the RDMA subsystem and add my backend to
> > > > rdma-core.
> > >
> > > IBverbs are horrid and I would not wish them on anyone. Seriously.
> >
> > I'm curious what drives this opinion? Did you have it since you
> > reviewed the initial submission all those years ago?
>
> As I learned more about that interface, yes, I like it less and less :)
>
> But that's the userspace api you all are stuck with, for various
> reasons, my opinion doesn't matter here.
>
> > > I think the general rdma apis are the key here, not the userspace api.
> >
> > Are you proposing that habana should have uAPI in drivers/misc and
> > present a standard rdma-core userspace for it? This is the only
> > userspace programming interface for RoCE HW. I think that would be
> > much more work.
> >
> > If not, what open source userspace are you going to ask them to
> > present to merge the kernel side into misc?
>
> I don't think that they have a userspace api to their rdma feature from
> what I understand, but I could be totally wrong as I do not know their
> hardware at all, so I'll let them answer this question.

Hi Greg,
We do expose a new IOCTL to enable the user to configure connections
between multiple GAUDI devices.

Having said that, we restrict this IOCTL to be used only by the same
user who is doing the compute on our device, as opposed to a real RDMA
device where any number of applications can send and receive.
In addition, this IOCTL limits the user to connect ONLY to another
GAUDI device and not to a 3rd party RDMA device.

It is true that GAUDI supports RDMA data movement, but the data
movement is NOT done by the user. It is done by our compute engines,
i.e. the compute engines perform "send" and "receive" without going
through the host (so there is no support for ibv_post_send or
ibv_post_recv). The only thing the user controls is which GAUDI is
connected to which. After that, the command submissions the user
performs to operate our compute engines will cause them to send and
receive RDMA packets.

Moreover, as opposed to smart NICs, where networking is the main
focus and compute is only secondary, in our device compute is the
main focus and the networking exists only to serve it.

The hl-thunk userspace library will have wrappers around this single
IOCTL (as it does for all our driver's IOCTLs) and will also contain
demos that show how to use it.


>
> > > Note, I do not know exactly what they are, but no, IBverbs are not ok.
> >
> > Should we stop merging new drivers and abandon the RDMA subsystem? Is
> > there something you'd like to see fixed?
> >
> > Don't really understand your position, sorry.
>
> For anything that _has_ to have a userspace RDMA interface, sure ibverbs
> are the one we are stuck with, but I didn't think that was the issue
> here at all, which is why I wrote the above comments.

To emphasize again, we don't want to expose a userspace RDMA interface.
We just want to allow our single compute user to configure a
connection to another GAUDI.

Thanks,
Oded

>
> thanks,
>
> greg k-h