Re: [RFC PATCH 0/7] A General Accelerator Framework, WarpDrive

From: Alan Cox
Date: Fri Aug 03 2018 - 10:21:31 EST


> > If we are going to have any kind of general purpose accelerator API then
> > it has to be able to implement things like
>
> Why is the existing driver model not good enough ? So you want
> a device with function X you look into /dev/X (for instance
> for GPU you look in /dev/dri)

Except when my GPU is in an FPGA, in which case it might be somewhere
else, or when it's a general purpose accelerator that happens to be
usable as a GPU. That is unusual today in the big computer space, but
you'll find it in microcontrollers.

> Each of those device need a userspace driver and thus this
> user space driver can easily knows where to look. I do not
> expect that every application will reimplement those drivers
> but instead use some kind of library that provide a high
> level API for each of those devices.

Think about it from the user level. You have a pipeline of things you
wish to execute, you need to get the right accelerator combinations, and
they need to fit together to meet system constraints like the number of
IOMMU IDs the accelerator supports and where they are connected.
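
As a rough sketch of that selection problem: pick one accelerator per
pipeline stage so that the whole combination meets the constraints.
Everything here is made up for illustration (the Accel record, the
iommu_ids and node fields, pick_pipeline); it is not an existing kernel
or WarpDrive interface, and real constraints would be richer than these
two.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Accel:
    name: str
    function: str   # what the device can do, e.g. "compress"
    iommu_ids: int  # hypothetical: number of IOMMU IDs the device supports
    node: int       # hypothetical: where the device is connected


def pick_pipeline(stages, accels, ids_needed, same_node=True):
    """Return the first combination with one accelerator per stage that
    satisfies the constraints, or None if no combination does."""
    candidates = [[a for a in accels if a.function == s] for s in stages]
    for combo in product(*candidates):
        # Every device in the pipeline must support enough IOMMU IDs.
        if any(a.iommu_ids < ids_needed for a in combo):
            continue
        # Optionally require that all stages sit on the same node.
        if same_node and len({a.node for a in combo}) > 1:
            continue
        return combo
    return None


accels = [
    Accel("zip0", "compress", iommu_ids=1, node=0),
    Accel("zip1", "compress", iommu_ids=8, node=1),
    Accel("aes0", "encrypt", iommu_ids=8, node=0),
    Accel("aes1", "encrypt", iommu_ids=8, node=1),
]
chosen = pick_pipeline(["compress", "encrypt"], accels, ids_needed=4)
```

The point of the sketch is that the choice for one stage depends on the
choice for the others, which is hard to do if each device class is
discovered through its own unrelated interface.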

> Now you have a hierarchy of memory for the CPU (HBM, local
> node main memory aka you DDR dimm, persistent memory) each

It's not a hierarchy, it's a graph. There's no fundamental reason two
accelerators can't be close to two different CPU cores but have shared
HBM that is far from each processor. There are physical reasons it tends
to look more like a hierarchy today.
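
A minimal sketch of that example as a graph, assuming invented node
names and invented relative access costs on each link. The shared HBM
hangs off both accelerators, so it has no single parent and the topology
cannot be drawn as a tree. The distance() helper is ordinary Dijkstra
and is only here to show that "how far is this memory" is a path query.

```python
import heapq

# Hypothetical topology; the weights are made-up relative access costs.
links = {
    ("cpu0", "acc0"): 1,
    ("cpu1", "acc1"): 1,
    ("cpu0", "cpu1"): 2,
    ("acc0", "hbm"): 5,
    ("acc1", "hbm"): 5,
}


def build_graph(links):
    """Turn the undirected link list into an adjacency map."""
    graph = {}
    for (a, b), cost in links.items():
        graph.setdefault(a, {})[b] = cost
        graph.setdefault(b, {})[a] = cost
    return graph


def distance(graph, src, dst):
    """Cheapest path cost from src to dst, or None if unreachable."""
    best = {src: 0}
    queue = [(0, src)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == dst:
            return cost
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, step in graph[node].items():
            new = cost + step
            if new < best.get(nxt, float("inf")):
                best[nxt] = new
                heapq.heappush(queue, (new, nxt))
    return None


graph = build_graph(links)
```

With these numbers the HBM is equally far from both CPUs, while each
accelerator is close to a different CPU.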

> Anyway i think finding devices and finding relation between
> devices and memory is 2 separate problems and as such should
> be handled separatly.

At a certain level they are deeply intertwined because you need a common
API. It's no good if I want a particular accelerator and then need to
see which API it's under on this machine and which interface I have to
use, and maybe end up with a mix of FPGA, WarpDrive and Google ASIC
interfaces, all different.

The job of the kernel is to impose some kind of sanity and unity on this
lot.

All of it in the end comes down to

'Somehow glue some chunk of memory into my address space and find any
supporting driver I need'

plus virtualization of the above.
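
The "glue memory into my address space" half is, from user space, little
more than an mmap() of whatever node the driver exposes. A minimal
sketch, assuming a Unix system; map_region and the path passed to it are
hypothetical, and a real device would define its own offsets and sizes.

```python
import mmap
import os


def map_region(path, length):
    """Map `length` bytes of `path` shared and read/write.

    `path` stands in for whatever node a driver exposes; it is a
    placeholder, not a real device name.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        return mmap.mmap(fd, length, mmap.MAP_SHARED,
                         mmap.PROT_READ | mmap.PROT_WRITE)
    finally:
        # The mapping keeps its own reference, so the fd can be closed.
        os.close(fd)
```

Finding the right path, and the driver behind it, is the part this
sketch does not solve.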

That bit's easy - but making it usable is a different story.

Alan