Re: [PATCH v3 00/14] Driver of Intel(R) Gaussian & Neural Accelerator

From: Daniel Stone
Date: Mon May 17 2021 - 15:33:05 EST

Next message: Suren Baghdasaryan: "Re: [[RFC]PATCH] psi: fix race between psi_trigger_create and psimon"
Previous message: Sander Vanheule: "Re: [PATCH 0/5] RTL8231 GPIO expander support"
In reply to: Thomas Zimmermann: "Re: [PATCH v3 00/14] Driver of Intel(R) Gaussian & Neural Accelerator"
Next in thread: Thomas Zimmermann: "Re: [PATCH v3 00/14] Driver of Intel(R) Gaussian & Neural Accelerator"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Hi,

On Mon, 17 May 2021 at 20:12, Thomas Zimmermann <tzimmermann@xxxxxxx> wrote:
> On 17.05.21 at 09:40, Daniel Vetter wrote:
> > We have, it's called drivers/gpu. Feel free to rename to drivers/xpu or
> > think G as in General, not Graphics.
>
> I hope this was a joke.
>
> Just some thoughts:
>
> AFAICT AI first came as an application of GPUs, but has now
> evolved/specialized into something of its own. I can imagine sharing
> some code among the various subsystems, say GEM/TTM internals for memory
> management. Besides that there's probably little that can be shared in
> the userspace interfaces. A GPU is a device that puts an image onto the
> screen and an AI accelerator isn't.

But it isn't. A GPU is a device that has a kernel-arbitrated MMU
hosting kernel-managed buffers, executes user-supplied compiled
programs with reference to those buffers and other jobs, and informs
the kernel about progress.
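
To make that definition concrete, here is a minimal toy model of it in
Python. It is only a sketch under stated assumptions: ToyAccelerator, Job,
alloc_buffer, submit and run_ready are hypothetical names invented for
illustration, not any real driver or DRM interface. The point is that
nothing in it mentions a screen.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import List


@dataclass
class Job:
    program: bytes                 # opaque user-supplied compiled program
    buffers: List[int]             # handles of the buffers it references
    depends_on: List[int] = field(default_factory=list)  # earlier job ids
    done: bool = False


class ToyAccelerator:
    """Hypothetical model: managed buffers, jobs, progress reporting."""

    def __init__(self):
        self._ids = count(1)
        self.buffers = {}   # handle -> bytearray, owned by the "kernel"
        self.jobs = {}      # job id -> Job

    def alloc_buffer(self, size):
        handle = next(self._ids)
        self.buffers[handle] = bytearray(size)
        return handle

    def submit(self, program, buffers, depends_on=()):
        # The "kernel" checks every reference before accepting the job.
        if any(h not in self.buffers for h in buffers):
            raise ValueError("job references an unknown buffer")
        if any(j not in self.jobs for j in depends_on):
            raise ValueError("job depends on an unknown job")
        job_id = next(self._ids)
        self.jobs[job_id] = Job(program, list(buffers), list(depends_on))
        return job_id

    def run_ready(self):
        """Complete jobs whose dependencies were already done; report ids."""
        ready = [
            job_id
            for job_id, job in self.jobs.items()
            if not job.done and all(self.jobs[d].done for d in job.depends_on)
        ]
        for job_id in ready:
            self.jobs[job_id].done = True
        return ready
```

Whether the opaque program is a shader or an NN inference kernel makes no
difference to this model; only the bytes in `program` change.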

KMS lies under the same third-level directory, but even when GPU and
display are on the same die, they're totally different IP blocks
developed on different schedules which are just periodically glued
together.

> Treating both as the same, even if
> they share similar chip architectures, seems like a stretch. They might
> evolve in different directions and fit less and less under the same
> umbrella.

Why not? All we have in common in GPU land right now is MMU + buffer
references + job scheduling + synchronisation. None of this has a common
top-level API, or even a common top-level model. It's not just ISA
differences, but we have very old-school devices where the kernel
needs to fill registers on every job, living next to middle-age devices
where the kernel and userspace co-operate to fill a ring buffer,
living next to modern devices where userspace does some stuff and then
the hardware makes it happen with the bare minimum of kernel
awareness.
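
As a rough illustration of how far apart those three submission styles are,
here is a toy sketch in Python. All class and attribute names are
hypothetical, the models are heavily simplified (the ring never checks for
overflow, for instance), and `kernel_writes` just counts how often the
"kernel" has to touch the device per submission.

```python
class RegisterFillDevice:
    """Old-school: the kernel writes every register for every job."""

    def __init__(self):
        self.regs = {}
        self.kernel_writes = 0

    def kernel_submit(self, job_regs):
        for name, value in job_regs.items():
            self.regs[name] = value
            self.kernel_writes += 1


class RingDevice:
    """Middle-age: commands go into a ring; the kernel bumps the pointer."""

    def __init__(self, size=8):
        self.ring = [None] * size
        self.wptr = 0
        self.kernel_writes = 0

    def kernel_submit(self, commands):
        # Userspace built the commands; the kernel places them in the ring.
        for cmd in commands:
            self.ring[self.wptr % len(self.ring)] = cmd
            self.wptr += 1
        self.kernel_writes += 1  # one write-pointer update per submission


class UserQueueDevice:
    """Modern: userspace fills its own queue and rings a doorbell."""

    def __init__(self):
        self.queue = []
        self.doorbell = 0
        self.kernel_writes = 0  # the kernel is not involved per job

    def user_submit(self, commands):
        self.queue.extend(commands)
        self.doorbell = len(self.queue)
```

The same three-command job costs the "kernel" three writes, one write, or
none, depending on which of these models the hardware follows.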

Honestly I think there's more difference between lima and amdgpu than
there is between amdgpu and current NN/ML devices.

Cheers,
Daniel
