Hi Sasha
So obviously great that Microsoft is trying to upstream all this, and
very much welcome and all that.
But I guess there's a bunch of rather fundamental issues before we
look into any kind of code details. And that might make this quite a
hard sell for upstreaming to the drivers/gpu subsystem:
- From the blog it sounds like the userspace is all closed. That
includes the hw specific part and compiler chunks, all stuff we've
generally expected to be able to look at in the past for any kind of
other driver. It's even documented here:
https://dri.freedesktop.org/docs/drm/gpu/drm-uapi.html#open-source-userspace-requirements
What's your plan here?
btw since the main goal here (at least at first) seems to be to get
compute and ML going the official work-around here is to relabel your
driver as an accelerator driver (just sed -e s/vGPU/vaccel/ over the
entire thing or so) and then Olof and Greg will take it into
drivers/accel ...
- Next up (but that's not really a surprise for a fresh vendor driver)
at a more technical level, this seems to reinvent the world, from
device enumeration (why is this not exposed as /dev/dri/card0 so it
better integrates with existing linux desktop stuff, in case that
becomes a goal ever) down to reinvented kref_put_mutex (and please
look at drm_device->struct_mutex for an example of how bad of a
nightmare that locking pattern is and how many years it took us to
untangle that one).
- Why DX12 on linux? Looking at this feels like classic divide and
conquer (or well triple E from the 90s), we have vk, we have
drm_syncobj, we have an entire ecosystem of winsys layers that work
across vendors. Is the plan here that we get a dx12 driver for other
hw mesa drivers from you guys, so this is all consistent and we have a
nice linux platform? How does this integrate everywhere else with
linux winsys standards, like dma-buf for passing stuff around,
dma-fence/sync_file/drm_syncobj for syncing, drm_fourcc/modifiers for
buffer formats and layouts, so we have some idea how it all meshes
together?
- There's been a pile of hallway track/private discussions about
moving on from the buffer-based memory management model to something more
modern. That relates to your DXLOCK2 question, but there's a lot more
to userspace managed gpu memory residency than just that. Monitored
fences are another part. Also, to avoid a platform split we need to
figure out how to tie this back into the dma-buf and dma-fence
(including various uapi flavours) or it'll be made of fail. dx12 has
all that in some form, except 0 integration with the linux stuff we
have (no surprise, since linux isn't windows). Finally if we go to the
trouble of a completely revamped uapi I think ioctls aren't a great idea,
something like io_uring (the gossip name is drm_uring) would be a lot
better. Also for easier paravirt we'd need 0 cpu pointers in any such
new interface. Adding a few people who've been involved in these
discussions thus far, mostly under a drm/hmm.ko heading iirc.
I think the above are the really big ticket items around what's the
plan here and are we solving even the right problem.
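
PS: For anyone not familiar with the kref_put_mutex-style pattern
mentioned above, it looks roughly like this. This is a userspace sketch
with C11 atomics and pthreads, not the kernel helper and not the code
from the patch; struct obj, obj_lock, dec_unless_one, obj_put_mutex and
obj_release are all made-up names for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical refcounted object, for illustration only. */
struct obj {
	atomic_int refcount;
	bool released;
};

/* One lock protecting lookups of all objects (assumed design). */
static pthread_mutex_t obj_lock = PTHREAD_MUTEX_INITIALIZER;

/* Called with obj_lock held and responsible for dropping it. */
static void obj_release(struct obj *o)
{
	o->released = true;	/* stand-in for list removal and free */
	pthread_mutex_unlock(&obj_lock);
}

/* Decrement unless the count is exactly 1; true if it decremented. */
static bool dec_unless_one(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 1) {
		/* On failure 'old' is reloaded and the loop rechecks it. */
		if (atomic_compare_exchange_weak(v, &old, old - 1))
			return true;
	}
	return false;
}

/* Returns 1 if this put released the object, 0 otherwise. */
static int obj_put_mutex(struct obj *o)
{
	if (dec_unless_one(&o->refcount))
		return 0;	/* fast path: lock never taken */

	pthread_mutex_lock(&obj_lock);
	if (atomic_fetch_sub(&o->refcount, 1) != 1) {
		/* Someone grabbed a new reference in the meantime. */
		pthread_mutex_unlock(&obj_lock);
		return 0;
	}
	obj_release(o);
	return 1;
}
```

The awkward part, at least in this sketch, is that the lock is only
taken on the final put and the release callback runs with it held, so
what obj_lock actually protects depends on the refcount state at the
time of the call.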