Re: [RFC PATCH v3 1/3] drm/syncobj: Add flag DRM_SYNCOBJ_QUERY_FLAGS_ERROR to query errors
From: Christian König
Date: Fri Feb 27 2026 - 05:19:58 EST
On 2/27/26 01:23, Matthew Brost wrote:
...
>> @@ -1702,16 +1708,28 @@ int drm_syncobj_query_ioctl(struct drm_device *dev, void *data,
>> point = dma_fence_is_signaled(last_signaled) ?
>> last_signaled->seqno :
>> to_dma_fence_chain(last_signaled)->prev_seqno;
>> +
>> + error = dma_fence_get_status(last_signaled);
>> }
>> dma_fence_put(last_signaled);
>> } else {
>> point = 0;
>> + error = fence ? dma_fence_get_status(fence) : 0;
>
> dma_fence_get_status returns 0 (unsignaled), 1 (signaled with no error),
> or fence->error (signaled with error != 0).
>
> Is it intentional to return 1 to user space for a signaled fence? What
> if a driver sets fence->error to 1?
>
> Side note: the fence error kernel doc says fence->error is only valid if
> < 0, but dma_fence_get_status doesn’t enforce that.
dma_fence_get_status() enforces this with a WARN_ON().
> Also, returning fence->error directly to user space seems like a massive
> problem. Right now, drivers can set fence->error to whatever they want,
> but now this gets reported to user space and suddenly has meaning. Does
> user space take certain actions based on the specific error code (e.g.,
> -ECANCELED, -ETIME, etc.)? It certainly can’t, because we have no
> internal kernel standards for what fence->error actually means. Two
> different drivers could assign the same error code but mean entirely
> different things—or the opposite could be true.
That is not even remotely true. fence->error is already used in the UAPI for syncfiles for like 10years or so. Android is massively relying on that.
There is also documentation on what values drivers should use: https://elixir.bootlin.com/linux/v6.19.3/source/include/linux/dma-fence.h#L565
The error reporting was just missing from drm_syncobj and only implemented for syncfiles and that's what this patch set here is fixing.
Regards,
Christian.
>
> Thus, without some standardization plus fixing every single driver, I
> really think the best we can report in a generic mechanism like a
> syncobj is simply “error” or “no error."
>
> Matt
>
>> }
>> dma_fence_put(fence);
>> +
>> ret = copy_to_user(&points[i], &point, sizeof(uint64_t));
>> ret = ret ? -EFAULT : 0;
>> if (ret)
>> break;
>> +
>> + if (args->flags & DRM_SYNCOBJ_QUERY_FLAGS_ERROR) {
>> + ret = copy_to_user(&handles[i], &error, sizeof(*handles));
>> +
>> + ret = ret ? -EFAULT : 0;
>> + if (ret)
>> + break;
>> + }
>> }
>> drm_syncobj_array_free(syncobjs, args->count_handles);
>>
>> diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h
>> index 27cc159c1d27..213b4dc9b612 100644
>> --- a/include/uapi/drm/drm.h
>> +++ b/include/uapi/drm/drm.h
>> @@ -1044,6 +1044,11 @@ struct drm_syncobj_array {
>> };
>>
>> #define DRM_SYNCOBJ_QUERY_FLAGS_LAST_SUBMITTED (1 << 0) /* last available point on timeline syncobj */
>> +/*
>> + * Copy the status of the fence as output into the handles array.
>> + * The handles array is overwritten by that.
>> + */
>> +#define DRM_SYNCOBJ_QUERY_FLAGS_ERROR (1 << 1)
>> struct drm_syncobj_timeline_array {
>> __u64 handles;
>> __u64 points;
>> --
>> 2.53.0
>>