Re: [RFC PATCH v3 0/3] Querying errors from drm_syncobj

From: Christian König

Date: Fri Feb 27 2026 - 05:14:49 EST


On 2/27/26 00:56, Matthew Brost wrote:
> On Wed, Feb 25, 2026 at 12:46:06PM +0000, Yicong Hui wrote:
>
> I thought it was a very intentional choice that fences are a completion
> mechanism only—they are not a mechanism to report or propagate errors.
>
> This series seems to change that way of thinking—why?

We already changed that a long time ago. See the error reporting for sync_files.

It was just missing for drm_syncobj, which this patch set fixes.

Regards,
Christian.
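
For reference, a minimal sketch of how user space can already read a fence error through a sync_file. It assumes the existing SYNC_IOC_FILE_INFO ioctl and struct sync_file_info from <linux/sync_file.h>, whose status field is documented as 0 for active, 1 for signaled and negative for an error. The helper name sync_file_status is made up for illustration, and it does not distinguish an ioctl failure from a fence error.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/sync_file.h>

/*
 * Hypothetical helper: returns 1 if the sync_file is signaled, 0 if it is
 * still active, or a negative errno value. A negative value can come either
 * from the fence itself (info.status < 0) or from the ioctl failing.
 */
static int sync_file_status(int fd)
{
	struct sync_file_info info;

	/* num_fences == 0 asks only for the summary, not per-fence details. */
	memset(&info, 0, sizeof(info));

	if (ioctl(fd, SYNC_IOC_FILE_INFO, &info) < 0)
		return -errno;

	return info.status;
}
```

With a valid sync_file fd, a negative return that is not an ioctl failure is the error recorded on the underlying fence.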

>
> Also consider these cases:
>
> - An input dependency to a job has an error in its fence, and the output
> of the job is installed in a syncobj. The job successfully runs but
> produces garbage because of the bad input. The job’s fence will not
> indicate an error because we don’t propagate input dependency errors to
> the job. This makes DRM_SYNCOBJ_QUERY_FLAGS_ERROR seem a bit pointless
> now.
>
> - A driver, for whatever reason, sets fence->error, and this fence is
> installed in a syncobj. Now user space starts using this new uAPI on
> syncobjs and everything breaks. This is odd behavior from the driver,
> but it was completely valid because fence->error never propagated to
> user space.
>
> I could probably come up with more examples of potential issues, but
> let’s start with the above.
>
> Matt
>
>> This patch series adds 2 new flags, DRM_SYNCOBJ_QUERY_FLAGS_ERROR and
>> DRM_SYNCOBJ_WAIT_FLAGS_ABORT_ON_ERROR, for 3 ioctl operations,
>> DRM_IOCTL_SYNCOBJ_QUERY, DRM_IOCTL_SYNCOBJ_WAIT and
>> DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT, to allow them to batch-request error
>> codes from multiple syncobjs and abort early if any of them has an error.
>>
>> Based on discussions from Michel Dänzer and Christian König, and a
>> starter task from the DRM todo documentation.
>>
>> See https://gitlab.gnome.org/GNOME/mutter/-/issues/4624 for discussions
>> on userspace implementation.
>>
>> I have looked into adding subtests for this to the igt tests
>> syncobj_wait.c and syncobj_timeline.c, and I think I understand the
>> process for writing and submitting tests. However, these ioctl flags
>> only take effect when there is an error, and I am not sure what the
>> best way is to artificially trigger an error from userspace in order
>> to test that they work. What's the recommended way to approach this?
>>
>> ---
>> Changes:
>> v3:
>> * Fixed inline comments by converting to multi-line comments in
>> accordance with kernel style guidelines.
>> * No longer using a separate superfluous function to walk the fence
>> chain; instead, the last signaled fence in the chain is queried for
>> its error code.
>> * Fixed types for error and handles array.
>> * Used dma_fence_get_status to query error instead of getting it
>> directly.
>>
>> v2:
>> https://lore.kernel.org/dri-devel/20260220022631.2205037-1-yiconghui@xxxxxxxxx/T/#m6ab4f94a19c769193895d7728383f84e452cbbfa
>> * Went from adding a new ioctl to implementing flags for existing
>> ones.
>>
>> v1:
>> * https://lore.kernel.org/all/20260213120836.81283-1-yiconghui@xxxxxxxxx/T/#mfdbc7f97e91ca5731b51b69c8cf8173cb0b2fb3e
>>
>> Yicong Hui (3):
>> drm/syncobj: Add flag DRM_SYNCOBJ_QUERY_FLAGS_ERROR to query errors
>> drm/syncobj: Add DRM_SYNCOBJ_WAIT_FLAGS_ABORT_ON_ERROR ioctl flag
>> drm/syncobj/doc: Remove starter task from todo list
>>
>> Documentation/gpu/todo.rst | 16 ------------
>> drivers/gpu/drm/drm_syncobj.c | 49 ++++++++++++++++++++++++++++++-----
>> include/uapi/drm/drm.h | 11 ++++++++
>> 3 files changed, 54 insertions(+), 22 deletions(-)
>>
>> --
>> 2.53.0
>>
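
The v3 changelog mentions using dma_fence_get_status to query the error. As a rough userspace model only (not the code from the patches), the documented return convention of that function and an "abort on the first error in a batch" scan might look like the sketch below. struct model_fence, model_fence_get_status and model_first_error are hypothetical names; the real struct dma_fence has more state and locking.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct dma_fence: only what the model needs. */
struct model_fence {
	bool signaled;
	int error;	/* 0 or a negative errno, meaningful once signaled */
};

/*
 * Mirrors the return convention documented for dma_fence_get_status():
 * 0 if not yet signaled, 1 if signaled without error, and a negative
 * error code if signaled with an error.
 */
static int model_fence_get_status(const struct model_fence *f)
{
	if (!f->signaled)
		return 0;
	return f->error ? f->error : 1;
}

/*
 * Scan a batch of fences and stop at the first one that carries an error.
 * Returns its index and stores the error in *err (if err is non-NULL),
 * or returns -1 when no fence in the batch has an error.
 */
static int model_first_error(const struct model_fence *fences, size_t count,
			     int *err)
{
	assert(fences != NULL || count == 0);

	for (size_t i = 0; i < count; i++) {
		int status = model_fence_get_status(&fences[i]);

		if (status < 0) {
			if (err)
				*err = status;
			return (int)i;
		}
	}
	return -1;
}
```

Unsignaled fences are skipped rather than treated as failures, so only a fence that has completed with an error stops the scan.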