Re: [RFC PATCH 0/3] new subsystem for compute accelerator devices

From: Alex Deucher
Date: Mon Oct 24 2022 - 11:18:32 EST


On Sat, Oct 22, 2022 at 5:46 PM Oded Gabbay <ogabbay@xxxxxxxxxx> wrote:
>
> In the last couple of months we had a discussion [1] about creating a new
> subsystem for compute accelerator devices in the kernel.
>
> After an analysis that was done by DRM maintainers and myself, and following
> a BOF session at the Linux Plumbers conference a few weeks ago [2], we
> decided to create a new subsystem that will use the DRM subsystem's code and
> functionality, i.e. the accel core code will be part of the DRM subsystem.
>
> This will allow us to leverage the extensive DRM code-base and
> collaborate with DRM developers who have experience with this type of
> device. In addition, new features added for the accelerator
> drivers can be of use to GPU drivers as well (e.g. RAS).
>
> As agreed in the BOF session, the accelerator devices will be exposed to
> user-space with new, dedicated device char files and a dedicated major
> number (261), to clearly separate them from graphics cards and the graphics
> user-space s/w stack. Furthermore, the drivers will be located in a separate
> place in the kernel tree (drivers/accel/).
>
> This series of patches is the first step in this direction as it adds the
> necessary infrastructure for accelerator devices to DRM. The new devices will
> be exposed with the following convention:
>
> device char files - /dev/accel/accel*
> sysfs - /sys/class/accel/accel*/
> debugfs - /sys/kernel/debug/accel/accel*/
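The naming convention above can be checked from user-space with a short sketch like the one below. It is illustrative only and not part of the patch series: the helper names (`accel_paths`, `is_accel_node`, `probe`) are hypothetical, and it assumes that nodes are named `accel<N>` with `N` starting at 0, which the cover letter only writes as `accel*`. The major number 261 is the one quoted above.

```python
import os
import stat

# Major number quoted in the cover letter for the new accel char devices.
ACCEL_MAJOR = 261


def accel_paths(index: int) -> dict:
    """Build the three paths for accel<index> (hypothetical helper).

    Assumes nodes are named accel<N>; the cover letter only says accel*.
    """
    name = f"accel{index}"
    return {
        "chardev": f"/dev/accel/{name}",
        "sysfs": f"/sys/class/accel/{name}/",
        "debugfs": f"/sys/kernel/debug/accel/{name}/",
    }


def is_accel_node(st_mode: int, st_rdev: int) -> bool:
    """True if the stat fields describe a char device with the accel major."""
    return stat.S_ISCHR(st_mode) and os.major(st_rdev) == ACCEL_MAJOR


def probe(index: int = 0) -> bool:
    """Return True if /dev/accel/accel<index> exists and looks like an accel node."""
    try:
        st = os.stat(accel_paths(index)["chardev"])
    except OSError:
        # Node is missing or not accessible on this system.
        return False
    return is_accel_node(st.st_mode, st.st_rdev)
```

On a system without the series applied, `probe()` simply returns False. Comparing the major number rather than the path alone is what distinguishes an accel node from a DRM node (major 226) that happens to be reachable under a similar name.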
>
> I tried to reuse the existing DRM code as much as possible, while keeping it
> readable and maintainable.

Wouldn't something like this:
https://patchwork.freedesktop.org/series/109575/
be simpler and provide better backwards compatibility for existing
non-gfx devices in the drm subsystem as well as newer devices?

Alex

>
> One thing that is missing from this series is defining a namespace for the
> new accel subsystem, which I'll add in the next iteration of this patch-set,
> after I receive feedback from the community.
>
> As for drivers, once this series is accepted (after adding the namespace),
> I will start working on migrating the habanalabs driver to the new accel
> subsystem. I have talked about it with Dave and we agreed that it will be
> a good start to simply move the driver as-is with minimal changes, and then
> start working on the driver's individual features that will be either added
> to the accel core code (with or without changes), or will be removed and
> instead the driver will use existing DRM code.
>
> In addition, I know of at least 3 or 4 drivers that were submitted for review
> and are good candidates to be included in this new subsystem, instead of being
> a drm render node driver or a misc driver.
>
> [1] https://lkml.org/lkml/2022/7/31/83
> [2] https://airlied.blogspot.com/2022/09/accelerators-bof-outcomes-summary.html
>
> Thanks,
> Oded
>
> Oded Gabbay (3):
>   drivers/accel: add new kconfig and update MAINTAINERS
>   drm: define new accel major and register it
>   drm: add dedicated minor for accelerator devices
>
>  Documentation/admin-guide/devices.txt |   5 +
>  MAINTAINERS                           |   8 +
>  drivers/Kconfig                       |   2 +
>  drivers/accel/Kconfig                 |  24 +++
>  drivers/gpu/drm/drm_drv.c             | 214 +++++++++++++++++++++-----
>  drivers/gpu/drm/drm_file.c            |  69 ++++++---
>  drivers/gpu/drm/drm_internal.h        |   5 +-
>  drivers/gpu/drm/drm_sysfs.c           |  81 +++++++++-
>  include/drm/drm_device.h              |   3 +
>  include/drm/drm_drv.h                 |   8 +
>  include/drm/drm_file.h                |  21 ++-
>  include/drm/drm_ioctl.h               |   1 +
>  12 files changed, 374 insertions(+), 67 deletions(-)
>  create mode 100644 drivers/accel/Kconfig
>
> --
> 2.34.1
>