Re: [PATCH 2/6] amd/amdkfd: Ignore return code of dma_fence_signal()
From: Christian König
Date: Thu Nov 27 2025 - 04:55:36 EST
On 11/27/25 10:48, Philipp Stanner wrote:
> On Wed, 2025-11-26 at 16:24 -0500, Kuehling, Felix wrote:
>>
>> On 2025-11-26 08:19, Philipp Stanner wrote:
>>> The return code of dma_fence_signal() is not really useful as there is
>>> nothing reasonable to do if a fence was already signaled. That return
>>> code shall be removed from the kernel.
>>>
>>> Ignore dma_fence_signal()'s return code.
>>
>> I think this is not correct. Looking at the comment in
>> evict_process_worker, we use the return value to decide a race
>> conditions where multiple threads are trying to signal the eviction
>> fence. Only one of them should schedule the restore work. And the other
>> ones need to increment the reference count to keep evictions balanced.
>
> Thank you for pointing that out. Seems then amdkfd is the only user who
> actually relies on the feature. See below
>
>>
>> Regards,
>> Felix
>>
>>
>>>
>>> Suggested-by: Christian König <christian.koenig@xxxxxxx>
>>> Signed-off-by: Philipp Stanner <phasta@xxxxxxxxxx>
>>> ---
>>> drivers/gpu/drm/amd/amdkfd/kfd_process.c | 5 ++---
>>> 1 file changed, 2 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>> index ddfe30c13e9d..950fafa4b3c3 100644
>>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
>>> @@ -1986,7 +1986,6 @@ kfd_process_gpuid_from_node(struct kfd_process *p, struct kfd_node *node,
>>> static int signal_eviction_fence(struct kfd_process *p)
>>> {
>>> struct dma_fence *ef;
>>> - int ret;
>>>
>>> rcu_read_lock();
>>> ef = dma_fence_get_rcu_safe(&p->ef);
>>> @@ -1994,10 +1993,10 @@ static int signal_eviction_fence(struct kfd_process *p)
>>> if (!ef)
>>> return -EINVAL;
>>>
>>> - ret = dma_fence_signal(ef);
>>> + dma_fence_signal(ef);
>
> The issue now is that dma_fence_signal()'s return code is actually non-
> racy, because check + bit-set are protected by lock.
>
> Christian's new spinlock series would add a lock function for fences:
> https://lore.kernel.org/dri-devel/20251113145332.16805-5-christian.koenig@xxxxxxx/
>
>
> So I suppose this should work:
>
> dma_fence_lock_irqsave(ef, flags);
> if (dma_fence_test_signaled_flag(ef)) {
> dma_fence_unlock_irqrestore(ef, flags);
> return true;
> }
> dma_fence_signal_locked(ef);
> dma_fence_unlock_irqrestore(ef, flags);
>
> return false;
>
>
> + some cosmetic adjustments for the boolean of course.
>
>
> Would that fly and be reasonable? @Felix, Christian.
I was just about to reply with the same idea when your mail arrived.
So yes looks totally reasonable to me.
Regards,
Christian.
>
>
> P.