Re: [Linaro-mm-sig] Re: [PATCH] epoll: try to be a _bit_ better about file lifetimes

From: Christian Brauner
Date: Wed May 08 2024 - 04:23:57 EST


On Tue, May 07, 2024 at 07:45:02PM +0200, Christian König wrote:
> Am 07.05.24 um 18:46 schrieb Linus Torvalds:
> > On Tue, 7 May 2024 at 04:03, Daniel Vetter <daniel@xxxxxxxx> wrote:
> > > It's really annoying that on some distros/builds we don't have that, and
> > > for gpu driver stack reasons we _really_ need to know whether a fd is the
> > > same as another, due to some messy uniqueness requirements on buffer
> > > objects various drivers have.
> > It's sad that such a simple thing would require two other horrid
> > models (EPOLL or KCMP).
> >
> > There's a reason that KCMP is a config option - *some* of that is
> > horrible code - but the "compare file descriptors for equality" is not
> > that reason.
> >
> > Note that KCMP really is a broken mess. It's also a potential security
> > hole, even for the simple things, because of how it ends up comparing
> > kernel pointers (ie it doesn't just say "same file descriptor", it
> > gives an ordering of them, so you can use KCMP to sort things in
> > kernel space).
> >
> > And yes, it orders them after obfuscating the pointer, but it's still
> > not something I would consider sane as a baseline interface. It was
> > designed for checkpoint-restore, it's the wrong thing to use for some
> > "are these file descriptors the same".
> >
> > The same argument goes for using EPOLL for that. Disgusting hack.
> >
> > Just what are the requirements for the GPU stack? Is one of the file
> > descriptors "trusted", IOW, you know what kind it is?
> >
> > Because dammit, it's *so* easy to do. You could just add a core DRM
> > ioctl for it. Literally just
> >
> > struct fd f1 = fdget(fd1);
> > struct fd f2 = fdget(fd2);
> > int same;
> >
> > same = f1.file && f1.file == f2.file;
> > fdput(f1);
> > fdput(f2);
> > return same;
> >
> > where the only question is if you also would want to deal with O_PATH
> > fd's, in which case the "fdget()" would be "fdget_raw()".
> >
> > Honestly, adding some DRM ioctl for this sounds hacky, but it sounds
> > less hacky than relying on EPOLL or KCMP.
> >
> > I'd be perfectly ok with adding a generic "FISAME" VFS level ioctl
> > too, if this is possibly a more common thing, and not just DRM wants
> > it.
> >
> > Would something like that work for you?
>
> Well the generic approach yes, the DRM specific one maybe. IIRC we need to
> be able to compare both DRM as well as DMA-buf file descriptors.
>
> The basic problem userspace tries to solve is that drivers might get the
> same fd through two different code paths.
>
> For example, an application using OpenGL/Vulkan for rendering and VA-API for
> video decoding/encoding at the same time.
>
> Both APIs get a fd which identifies the device to use. It can be the same,
> but it doesn't have to.
>
> If it's the same device driver connection (or in kernel speak underlying
> struct file) then you can optimize away importing and exporting of buffers
> for example.
>
> In addition to that, it makes cgroup accounting much easier because you don't
> count things twice because they are shared etc...

One thing to keep in mind is that a generic VFS level comparison function
will only catch the obvious case where you have dup() equivalency as
outlined above by Linus. That's what most people are interested in and
that could easily replace most kcmp() use-cases for comparing fds.

But, of course there's the case where you have two file descriptors
referring to two different files that reference the same underlying
object (usually stashed in file->private_data).

For most cases that problem can ofc be solved by comparing the
underlying inode. But that doesn't work for drivers using the generic
anonymous inode infrastructure, because it uses the same inode for
everything, or for cases where the same underlying object can even be
represented by different inodes.

So for such cases a driver specific ioctl() to compare two fds will
be needed in addition to the generic helper.