Re: [PATCH v3 00/14] Adding GAUDI NIC code to habanalabs driver

From: Greg Kroah-Hartman
Date: Sun Sep 20 2020 - 04:47:09 EST


On Sat, Sep 19, 2020 at 04:22:35PM -0300, Jason Gunthorpe wrote:
> On Sat, Sep 19, 2020 at 07:27:30PM +0200, Greg Kroah-Hartman wrote:
> > > It's probably heresy, but why do I need to integrate into the RDMA subsystem?
> > > I understand your reasoning about networking (Ethernet) as the driver
> > > connects to the kernel networking stack (netdev), but with RDMA the
> > > driver doesn't use or connect to anything in that stack. If I were to
> > > support IBverbs and declare that I support it, then of course I would
> > > need to integrate to the RDMA subsystem and add my backend to
> > > rdma-core.
> >
> > IBverbs are horrid and I would not wish them on anyone. Seriously.
>
> I'm curious what drives this opinion? Did you have it since you
> reviewed the initial submission all those years ago?

As I learned more about that interface, yes, I like it less and less :)

But that's the userspace api you all are stuck with, for various
reasons; my opinion doesn't matter here.

> > I think the general rdma apis are the key here, not the userspace api.
>
> Are you proposing that habana should have uAPI in drivers/misc and
> present a standard rdma-core userspace for it? This is the only
> userspace programming interface for RoCE HW. I think that would be
> much more work.
>
> If not, what open source userspace are you going to ask them to
> present to merge the kernel side into misc?

I don't think that they have a userspace api to their rdma feature from
what I understand, but I could be totally wrong as I do not know their
hardware at all, so I'll let them answer this question.

> > Note, I do not know exactly what they are, but no, IBverbs are not ok.
>
> Should we stop merging new drivers and abandon the RDMA subsystem? Is
> there something you'd like to see fixed?
>
> Don't really understand your position, sorry.

For anything that _has_ to have a userspace RDMA interface, sure ibverbs
are the one we are stuck with, but I didn't think that was the issue
here at all, which is why I wrote the above comments.

thanks,

greg k-h