Re: [RFC PATCH v3 0/3] Querying errors from drm_syncobj

From: Matthew Brost

Date: Thu Feb 26 2026 - 18:57:43 EST


On Wed, Feb 25, 2026 at 12:46:06PM +0000, Yicong Hui wrote:

I thought it was a very intentional choice that fences are a completion
mechanism only—they are not a mechanism to report or propagate errors.

This series seems to change that way of thinking—why?

Also consider these cases:

- An input dependency to a job has an error in its fence, and the output
of the job is installed in a syncobj. The job successfully runs but
produces garbage because of the bad input. The job’s fence will not
indicate an error because we don’t propagate input dependency errors to
the job. This makes DRM_SYNCOBJ_QUERY_FLAGS_ERROR seem a bit pointless
now.

- A driver, for whatever reason, sets fence->error, and this fence is
installed in a syncobj. Now user space starts using this new uAPI on
syncobjs and everything breaks. This is odd behavior from the driver,
but it was completely valid because fence->error never propagated to
user space.
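
To make the first case concrete, here is a rough sketch. It is a toy
userspace model, not kernel code: every toy_* name is made up for
illustration, and the only thing borrowed from the real API is the
return convention of dma_fence_get_status() (0 if unsignaled, 1 if
signaled without error, negative errno otherwise).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for struct dma_fence: only the fields that matter here. */
struct toy_fence {
	bool signaled;
	int error;	/* 0 or a negative errno, like fence->error */
};

/* Mirrors the return convention of dma_fence_get_status(). */
static int toy_fence_status(const struct toy_fence *f)
{
	if (!f->signaled)
		return 0;
	return f->error ? f->error : 1;
}

/*
 * Hypothetical job: the input fence only gates execution. Its error is
 * never copied to the output fence, so the job "succeeds" even though
 * it consumed bad input.
 */
static void toy_run_job(const struct toy_fence *in, struct toy_fence *out)
{
	if (!in->signaled)
		return;
	out->signaled = true;
	out->error = 0;
}

/* The input carries -EIO, yet the output fence reports plain success. */
static int toy_demo(void)
{
	struct toy_fence in = { .signaled = true, .error = -EIO };
	struct toy_fence out = { 0 };

	toy_run_job(&in, &out);
	assert(toy_fence_status(&in) == -EIO);
	return toy_fence_status(&out);
}
```

In this model, a query on the syncobj holding the output fence sees no
error, so user space learns nothing about the failed dependency.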

I could probably come up with more examples of potential issues, but
let’s start with the above.

Matt

> This patch series adds 2 new flags, DRM_SYNCOBJ_QUERY_FLAGS_ERROR and
> DRM_SYNCOBJ_WAIT_FLAGS_ABORT_ON_ERROR for 3 ioctl operations
> DRM_IOCTL_SYNCOBJ_QUERY, DRM_IOCTL_SYNCOBJ_WAIT and
> DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT to allow them to batch-request error
> codes from multiple syncobjs and abort early upon error of any of them.
>
> Based on discussions from Michel Dänzer and Christian König, and a
> starter task from the DRM todo documentation.
>
> See https://gitlab.gnome.org/GNOME/mutter/-/issues/4624 for discussions
> on userspace implementation.
>
> I have looked into adding sub test cases into syncobj_wait.c and
> syncobj_timeline.c, igt-tests for this and I think I understand the
> process for writing tests and submitting them, however, these ioctls
> only trigger in the case that there is an error, but I am not sure what
> is the best way to artificially trigger an error from userspace in order
> to test that these ioctl flags work. What's the recommended way to
> approach this?
>
> ---
> Changes:
> v3:
> * Fixed inline comments by converting to multi-line comments in
> accordance with kernel style guidelines.
> * No longer using a separate superfluous function to walk the fence
> chain, and instead queries the last signaled fence in the chain for
> its error code
> * Fixed types for error and handles array.
> * Used dma_fence_get_status to query error instead of getting it
> directly.
>
> v2:
> https://lore.kernel.org/dri-devel/20260220022631.2205037-1-yiconghui@xxxxxxxxx/T/#m6ab4f94a19c769193895d7728383f84e452cbbfa
> * Went from adding a new ioctl to implementing flags for existing
> ones.
>
> v1:
> * https://lore.kernel.org/all/20260213120836.81283-1-yiconghui@xxxxxxxxx/T/#mfdbc7f97e91ca5731b51b69c8cf8173cb0b2fb3e
>
> Yicong Hui (3):
> drm/syncobj: Add flag DRM_SYNCOBJ_QUERY_FLAGS_ERROR to query errors
> drm/syncobj: Add DRM_SYNCOBJ_WAIT_FLAGS_ABORT_ON_ERROR ioctl flag
> drm/syncobj/doc: Remove starter task from todo list
>
> Documentation/gpu/todo.rst | 16 ------------
> drivers/gpu/drm/drm_syncobj.c | 49 ++++++++++++++++++++++++++++++-----
> include/uapi/drm/drm.h | 11 ++++++++
> 3 files changed, 54 insertions(+), 22 deletions(-)
>
> --
> 2.53.0
>