Re: [PATCH v6 3/8] rust: file: add Rust abstraction for `struct file`

From: Alice Ryhl
Date: Tue May 28 2024 - 16:29:30 EST


On Tue, May 28, 2024 at 9:36 PM Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
>
> On Mon, May 27, 2024 at 04:03:56PM +0000, Alice Ryhl wrote:
>
> > > In other words, if current->files->count is equal to 1 at fdget() time
> > > we can skip incrementing refcount. Matching fdput() would need to
> > > skip decrement, of course. Note that we must record that (borrowed
> > > vs. cloned) in struct fd - the condition cannot be rechecked at fdput()
> > > time, since the table that had been shared at fdget() time might no longer
> > > be shared by the time of fdput().
> >
> > This is great! It matches my understanding. I didn't know the details
> > about current->files and task->files.
> >
> > You should copy this to the kernel documentation somewhere. :)
>
> Probably, after it's turned into something more coherent - and after
> the description of struct fd scope rules is corrected ;-/
>
> Correction in question: you _are_ allowed to move reference from
> descriptor table while in scope of struct fd; what you are not allowed
> is dropping that reference until the end of scope.

The patch you are commenting on contains a change to fs/file.c with
exactly that correction. I'm not sure if you noticed it - I should
probably have put it in its own commit to make it more obvious.

> That's what binder_do_fd_close() is about - binder_deferred_fd_close()
> is called in a struct fd scope (anything called from ->unlocked_ioctl()
> instances is). It *does* remove a struct file reference from
> descriptor table:
> twcb->file = file_close_fd(fd);
> moves that reference to twcb->file, and subsequent
> get_file(twcb->file);
> filp_close(twcb->file, current->files);
> completes detaching it from the table[*] and the reference itself
> is dropped via task_work, which is going to be executed on the
> way out to userland, definitely after we leave the scope of
> struct fd.

Yeah. If you look at previous versions of this patchset, it contains
Rust code for performing exactly that dance. I was asked to drop it
from the patch series, though.

> Incidentally, I'm very tempted to unexport close_fd(); in addition to
> being a source of bugs when called from ioctl on user-supplied descriptor
> it encourages racy crap - just look at e.g. 1819200166ce
> "drm/amdkfd: Export DMABufs from KFD using GEM handles", where we
> call drm_gem_prime_handle_to_fd(), immediately followed by
> dmabuf = dma_buf_get(fd);
> close_fd(fd);
> dup2() from another thread with guessed descriptor number as target and
> you've got a problem... It's not a violation of fdget() use rules
> (it is called from ioctl, but descriptor is guaranteed to be different
> from the one passed to ioctl(2)), but it's still wrong. Would take
> some work, though...

Wait, what's going on there? It adds the fd and then immediately
removes it again, or?

> "Detaching from the table" bit also needs documenting, BTW. If you look
> at that thing, you'll see that current->files is converted to fl_owner_t,
> which is an opaque pointer. What happens is that dnotify and POSIX lock
> use current->files as opaque tags (->dn_owner and ->flc_owner resp.) and
> filp_close() (well, filp_flush() these days) needs to be called to
> purge all of those associated with given struct file and given tag value.

Ah, yes, fl_owner_t being a void pointer rather than having a proper
type caused a bug in an early version of Rust Binder ...

Alice

> That needs to be done between removal of file reference from table and
> destruction of the table itself and it guarantees that those opaque references
> won't outlive the table (more importantly, don't survive until a different
> files_struct instance is allocated at the same address).
>
> [*] NB: might make sense to export filp_flush(), since that's what this
> sequence boils down to. We really need a better identifier, though -
> "filp" part is a leftover from OSDI, AFAICT; that's a hungarism for
> "file pointer" and it makes no sense. file_flush() would be better,
> IMO - or flush_file(), for that matter.