Re: [PATCH v5 1/1] drm/doc: Document DRM device reset expectations

From: André Almeida
Date: Tue Jun 27 2023 - 17:31:48 EST


Hi Marek,

On 27/06/2023 15:57, Marek Olšák wrote:
On Tue, Jun 27, 2023, 09:23 André Almeida <andrealmeid@xxxxxxxxxx> wrote:

+User Mode Driver
+----------------
+
+The UMD should check before submitting new commands to the KMD if the device has
+been reset, and this can be checked more often if the UMD requires it. After
+detecting a reset, UMD will then proceed to report it to the application using
+the appropriate API error code, as explained in the section below about
+robustness.


The UMD won't check the device status before every command submission due to ioctl overhead. Instead, the KMD should skip command submission and return an error that it was skipped.
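
To make the flow described above concrete, here is a minimal sketch in C. It is not taken from any real driver: the fake_* names, the struct layouts and the choice of -ECANCELED as the "submission skipped" error are all assumptions made for illustration. The point is only that the UMD issues no separate status query before submitting; it learns about the reset from the error returned by the submission itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel-side state of one context. */
struct fake_kmd_ctx {
    bool reset_pending;  /* set by the "kernel" once the device was reset */
    unsigned submitted;  /* number of jobs that were actually queued */
};

/*
 * Hypothetical submit entry point. After a reset it skips the job and
 * returns an error instead of queuing it. -ECANCELED is an assumed
 * error code, chosen only for this sketch.
 */
static int fake_kmd_submit(struct fake_kmd_ctx *ctx)
{
    assert(ctx != NULL);
    if (ctx->reset_pending)
        return -ECANCELED;  /* job skipped, nothing queued */
    ctx->submitted++;
    return 0;
}

/* What the UMD would later report through its API (also hypothetical). */
enum fake_umd_status { FAKE_UMD_OK, FAKE_UMD_DEVICE_LOST };

struct fake_umd_device {
    struct fake_kmd_ctx *ctx;
    enum fake_umd_status status;  /* sticky once the device is lost */
};

/*
 * UMD flush path: no status query before submitting. The reset is
 * detected from the error returned by the submission itself.
 */
static enum fake_umd_status fake_umd_flush(struct fake_umd_device *dev)
{
    assert(dev != NULL);
    if (dev->status == FAKE_UMD_DEVICE_LOST)
        return FAKE_UMD_DEVICE_LOST;  /* already lost, do not resubmit */
    if (fake_kmd_submit(dev->ctx) == -ECANCELED)
        dev->status = FAKE_UMD_DEVICE_LOST;
    return dev->status;
}
```

In this sketch, a flush on a healthy context queues one job and returns FAKE_UMD_OK. Once reset_pending is set, the next flush queues nothing and returns FAKE_UMD_DEVICE_LOST, and later flushes return the same status without calling into the fake KMD again.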

I wrote it this way because, when reading the source code for vk::check_status()[0] and Gallium's si_flush_gfx_cs()[1], I was under the impression that the UMD checks the reset status before every submission/flush.

Is your comment about how things are currently implemented, or about how they would ideally work? Either way I can apply your suggestion; I just want to be sure which one you mean.

[0] https://elixir.bootlin.com/mesa/mesa-23.1.3/source/src/vulkan/runtime/vk_device.h#L142
[1] https://elixir.bootlin.com/mesa/mesa-23.1.3/source/src/gallium/drivers/radeonsi/si_gfx_cs.c#L83


The only case where that won't be applicable is user queues where drivers don't call into the kernel to submit work, but they do call into the kernel to create a dma_fence. In that case, the call to create a dma_fence can fail with an error.

Marek