[RFC PATCH 0/1] drm: Add doc about GPU reset
From: André Almeida
Date: Mon Jan 23 2023 - 15:27:50 EST
Due to the complexity of the GPU stack and the apps that we run on it, GPU
resets are a given. What's left for driver developers is how to make resets as
smooth an experience as possible. While some OSes can recover or show an error
message in such cases, Linux is more hit-and-miss due to its lack of
standardization and guidelines on what to do in such cases.
That is the goal of this document: to properly define what should happen after
a GPU reset, so developers can start building on top of it. An IGT test should
be created to validate this for each driver.
Initially my approach was to expose a uevent for GPU resets, as can be seen
here[1]. However, even if a uevent can be useful for some use cases (e.g.
telemetry and error reporting), for the "OS integration" case of GPU resets
it would be more productive to have something defined throughout the stack.
Thanks,
André
[1] https://lore.kernel.org/amd-gfx/20221125175203.52481-1-andrealmeid@xxxxxxxxxx/
André Almeida (1):
  drm: Create documentation about device resets

 Documentation/gpu/drm-reset.rst | 51 +++++++++++++++++++++++++++++++++
 Documentation/gpu/index.rst     |  1 +
 2 files changed, 52 insertions(+)
 create mode 100644 Documentation/gpu/drm-reset.rst
--
2.39.1