Re: [PATCH v3 00/14] Adding GAUDI NIC code to habanalabs driver
From: Jason Gunthorpe
Date: Fri Sep 18 2020 - 10:19:15 EST
On Fri, Sep 18, 2020 at 05:12:04PM +0300, Oded Gabbay wrote:
> On Fri, Sep 18, 2020 at 4:59 PM Jason Gunthorpe <jgg@xxxxxxxx> wrote:
> >
> > On Fri, Sep 18, 2020 at 04:49:25PM +0300, Oded Gabbay wrote:
> > > On Fri, Sep 18, 2020 at 4:26 PM Jason Gunthorpe <jgg@xxxxxxxx> wrote:
> > > >
> > > > On Fri, Sep 18, 2020 at 04:02:24PM +0300, Oded Gabbay wrote:
> > > >
> > > > > The problem with MR is that the API doesn't let us return a new VA. It
> > > > > forces us to use the original VA that the Host OS allocated.
> > > >
> > > > If using the common MR API you'd have to assign a unique linear range
> > > > in the single device address map and record both the IOVA and the MMU
> > > > VA in the kernel struct.
> > > >
> > > > Then when submitting work using that MR lkey the kernel will adjust
> > > > the work VA using the equation (WORK_VA - IOVA) + MMU_VA before
> > > > forwarding to HW.
> > > >
> > > We can't do that. That will kill the performance. If for every
> > > submission I need to modify the packet's contents, the throughput will
> > > go downhill.
> >
> > You clearly didn't read where I explained there is a fast path and
> > slow path expectation.
> >
> > > Also, submissions to our RDMA qmans are coupled with submissions to
> > > our DMA/Compute QMANs. We can't separate those to different API calls.
> > > That will also kill performance and in addition, will prevent us from
> > > synchronizing all the engines.
> >
> > Not sure I see why this is a problem. I already explained the fast
> > device specific path.
> >
> > As long as the kernel maintains proper security when it processes
> > submissions the driver can allow objects to cross between the two
> > domains.
> Can you please explain what you mean by "two domains" ?
> You mean the RDMA and compute domains ? Or something else ?
Yes
> What I was trying to say is that I don't want the application to split
> its submissions to different system calls.
If you can manage the security then you can cross them. Eg since The
RDMA PD would be created on top of the /dev/misc char dev then it is
fine for the /dev/misc char dev to access the RDMA objects as a 'dv
fast path'.
But now that you say everything is interconnected, I'm wondering,
without HW security how do you keep netdev isolated from userspace?
Can I issue commands to /dev/misc and write to kernel memory (does the
kernel put any pages into the single MMU?) or corrupt the netdev
driver operations in any way?
Jason