Re: [PATCH 1/3] habanalabs: implement dma-fence mechanism

From: Daniel Vetter
Date: Mon Jul 13 2020 - 14:34:27 EST


On Mon, Jul 13, 2020 at 5:57 PM Greg Kroah-Hartman
<gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Mon, Jul 13, 2020 at 06:54:22PM +0300, Oded Gabbay wrote:
> > From: Ofir Bitton <obitton@xxxxxxxxx>
> >
> > Instead of using standard dma-fence mechanism designed for GPU's, we
> > introduce our own implementation based on the former one. This
> > implementation is much more sparse than the original, contains only
> > mandatory functionality required by the driver.
>
> Sad you can't use the in-kernel code for this, I really don't understand
> what's wrong with using it as-is.
>
> Daniel, why do we need/want duplicate code floating around in the tree
> like this?

The rules around dma-fence are ridiculously strict, and it only makes
sense to inflict that upon you if you actually want to participate in
the cross driver uapi built up around dma-buf and dma-fence.

I've recently started some lockdep annotations to better enforce these
rules (and document them), and it's finding tons of subtle bugs even
in drivers/gpu (and I only just started with annotating drivers:

https://lore.kernel.org/dri-devel/20200707201229.472834-1-daniel.vetter@xxxxxxxx/

You really don't want to deal with this if you don't have to. If
drivers/gpu folks (who created this) aren't good enough to understand
it, maybe it's not a good idea to sprinkle this all over the tree. And
fundamentally all this is is a slightly fancier struct completion. Use
that one instead, or a wait_queue.

I discussed this a bit with Oded, and he thinks it's easier to
copypaste and simplify, but given that all other drivers seem to get
by perfectly well with completion or wait_queue, I'm not sure that's a
solid case.

Also adding Jason Gunthorpe, who very much suggested this should be
limited to dma-buf/gpu related usage only.

> Copying code leads to errors, here's some documentation ones:

Yeah except here reusing code without understanding what it does and
how it should be used leads to error :-) At least given by the
drivers/gpu track record, I'm pretty sure sprinkling my new dma_fence
lockdep annotations would lead to lots of splats.

Cheers, Daniel

>
> > --- /dev/null
> > +++ b/drivers/misc/habanalabs/hl_dma_fence.c
> > @@ -0,0 +1,338 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * Fence mechanism for dma-buf and to allow for asynchronous dma access
>
> Is that what this still does?
>
> > + *
> > + * Copyright (C) 2012 Canonical Ltd
> > + * Copyright (C) 2012 Texas Instruments
> > + *
> > + * Authors:
> > + * Rob Clark <robdclark@xxxxxxxxx>
> > + * Maarten Lankhorst <maarten.lankhorst@xxxxxxxxxxxxx>
> > + *
> > + * The dma_fence module is a copy of dma-fence at drivers/dma-buf.
>
> "The hl_dma_fence" module...
>
> And is it a stand-alone module? Or just a single file?
>
> > + * This was done due to an explicit request by GPU developers who asked not
> > + * to use the dma-buf module because we aren't part of DRM subsystem.
>
> Why is dma-buf only for use for DRM?
>
> If it is, should the symbol namespace be set to that to catch users that
> want to use it for their own code?
>
> > + * This copy was stripped from all extra features that habanalabs driver
> > + * doesn't use, including the uapi interface dma-buf exposes.
> > + * In addition, we removed the callbacks because the only usage is from inside
> > + * habanalabs driver
> > + */
> > +
> > +#include "hl_dma_fence.h"
> > +#include "habanalabs.h"
> > +#include <linux/slab.h>
> > +#include <linux/export.h>
> > +#include <linux/atomic.h>
> > +#include <linux/sched/signal.h>
> > +
> > +/**
> > + * DOC: DMA fences overview
> > + *
> > + * DMA fences, represented by &struct hl_dma_fence, are the kernel internal
> > + * synchronization primitive for DMA operations like GPU rendering, video
> > + * encoding/decoding, or displaying buffers on a screen.
>
> I don't think this is correct anymore, right? :(
>
> > --- /dev/null
> > +++ b/drivers/misc/habanalabs/hl_dma_fence.h
> > @@ -0,0 +1,148 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * Fence mechanism for dma-buf to allow for asynchronous dma access
> > + *
> > + * Copyright (C) 2012 Canonical Ltd
> > + * Copyright (C) 2012 Texas Instruments
> > + *
> > + * Authors:
> > + * Rob Clark <robdclark@xxxxxxxxx>
> > + * Maarten Lankhorst <maarten.lankhorst@xxxxxxxxxxxxx>
> > + *
> > + * The dma_fence module is a copy of dma-fence at drivers/dma-buf.
>
> Same comments here for the .h file.
>
> thanks,
>
> greg k-h



--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch