Re: [PATCH 1/2] dma-fence: Rename dma_fence_is_signaled()

From: Christian König
Date: Thu Apr 10 2025 - 04:38:23 EST


Am 09.04.25 um 17:04 schrieb Philipp Stanner:
On Wed, 2025-04-09 at 16:10 +0200, Christian König wrote:
I only see improvement by making things more obvious.

In any case, how would you call a wrapper that just does
test_bit(IS_SIGNALED, …) ?
Broken, that was very intentionally removed quite shortly after we
created the framework.

We have a few cases were implementations do check that for their
fences, but consumers should never be allowed to touch such
internals.
There is theory and there is practice. In practice, those internals are
being used by Nouveau, i915, Xe, vmgfx and radeon.

What do you mean? I only skimmed over the use cases, but as far as I can see those are all valid.

You can test the flag if you know what the fence means to you, that is not a problem at all.

So it seems that we failed quite a bit at communicating clearly how the
interface should be used.

And, to repeat myself, with both name and docu of that function, I
think it is very easy to misunderstand what it's doing. You say that it
shouldn't matter – and maybe that's true, in theory. In practice, it
does matter. In practice, APIs get misused and have side-effects. And
making that harder is desirable.

That sounds like I didn't used the right wording.

It *must* not matter to the consumer. See the purpose of the DMA-fence framework is to make it irrelevant for the consumer how the provider has implemented it's fences.

This means that things like if polling or interrupt driven signaling is used, 32bit vs 64bit seq numbers, etc... should all be hidden by the framework from the consumer of the fences.


BTW I'm actually not sure if nouveau has a bug here. As far as I can see nouveau_fence_signal() will be called later eventually and do the necessary cleanup.

But on the other hand it wouldn't surprise me if nouveau has a bug with that. The driver has been basically only barely maintained for quite a while.

In any case, I might have to add another such call to Nouveau, because
the solution preferred by you over the callback causes another race.
Certainly one could solve this in a clean way, but someone has to do
the work, and we're talking about more than a few hours here.

Well this is not my preferred solution, it's just the technical correct solution as far as I can see.

In any case, be so kind and look at patch 2 and tell me there if you're
at least OK with making the documentation more detailed.

As far as I can see that is clearly the wrong place to document that stuff.

Regards,
Christian.


P.
From 30b9574baadcf860104e61ce2767f9999bbdd77d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Christian=20K=C3=B6nig?= <christian.koenig@xxxxxxx>
Date: Thu, 10 Apr 2025 10:18:29 +0200
Subject: [PATCH] drm/nouveau: fix and cleanup fence handling
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The fence was not removed from the pending list when signaled from the
.signaled callback. Fix that and also remove the superflous
.enable_signaling callback.

Signed-off-by: Christian König <christian.koenig@xxxxxxx>
---
drivers/gpu/drm/nouveau/nouveau_fence.c | 31 +++++++------------------
1 file changed, 8 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c
index 7cc84472cece..53c70ddef964 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
@@ -485,32 +485,18 @@ static bool nouveau_fence_is_signaled(struct dma_fence *f)
ret = (int)(fctx->read(chan) - fence->base.seqno) >= 0;
rcu_read_unlock();

- return ret;
-}
-
-static bool nouveau_fence_no_signaling(struct dma_fence *f)
-{
- struct nouveau_fence *fence = from_fence(f);
-
- /*
- * caller should have a reference on the fence,
- * else fence could get freed here
- */
- WARN_ON(kref_read(&fence->base.refcount) <= 1);
+ if (ret) {
+ /*
+ * caller should have a reference on the fence,
+ * else fence could get freed here
+ */
+ WARN_ON(kref_read(&fence->base.refcount) <= 1);

- /*
- * This needs uevents to work correctly, but dma_fence_add_callback relies on
- * being able to enable signaling. It will still get signaled eventually,
- * just not right away.
- */
- if (nouveau_fence_is_signaled(f)) {
list_del(&fence->head);
-
dma_fence_put(&fence->base);
- return false;
}

- return true;
+ return ret;
}

static void nouveau_fence_release(struct dma_fence *f)
@@ -525,7 +511,6 @@ static void nouveau_fence_release(struct dma_fence *f)
static const struct dma_fence_ops nouveau_fence_ops_legacy = {
.get_driver_name = nouveau_fence_get_get_driver_name,
.get_timeline_name = nouveau_fence_get_timeline_name,
- .enable_signaling = nouveau_fence_no_signaling,
.signaled = nouveau_fence_is_signaled,
.wait = nouveau_fence_wait_legacy,
.release = nouveau_fence_release
@@ -540,7 +525,7 @@ static bool nouveau_fence_enable_signaling(struct dma_fence *f)
if (!fctx->notify_ref++)
nvif_event_allow(&fctx->event);

- ret = nouveau_fence_no_signaling(f);
+ ret = nouveau_fence_is_signaled(f);
if (ret)
set_bit(DMA_FENCE_FLAG_USER_BITS, &fence->base.flags);
else if (!--fctx->notify_ref)
--
2.34.1