Re: [MEH PATCH] fs: sort out a stale comment about races between fd alloc and dup2

From: Al Viro
Date: Tue Dec 10 2024 - 13:15:53 EST


On Tue, Dec 10, 2024 at 05:48:40AM +0100, Mateusz Guzik wrote:

> Oh huh. I had seen that code before, did not mentally register there
> may be repeat file alloc/free calls due to repeat path_openat.
>
> Indeed it would be nice if someone(tm) sorted it out, but I don't see
> how this has any relation to installing the file early and thus having
> fget worry about it.

Other than the former being an obvious prereq for the latter? Not much...

> Suppose the "embryo"/"larval" file pointer is to be installed early
> and populated later. I don't see a benefit but do see a downside: this
> requires protection against close() on the fd (on top of dup2 needed
> now).
> The options that I see are:
> - install the file with a refcount of 2, let dup2/close whack it, do a
> fput in open to bring back to 1 or get rid of it if it raced (yuck)
> (freebsd is doing this)
> - dup2 is already special casing to not mess with it, add that to
> close as well (also yuck imo)

As a possibility (again, I'm not sold on the benefits of that scheme,
just looking into feasibility):
dup2() when evicting an embryo:
	mark it evicted
	remove from descriptor table
	do nothing to refcount (in effect, transfer it to open())
	then proceed as if it hadn't been there
	[== pretend that dup2() always loses the race]
close() when running into an embryo
	return -EBADF
	[== pretend that close() always loses the race]
open() when it's done setting file up:
	if opening failed
		if not marked evicted
			remove from descriptor table
		fput()
		return whatever error we've got
	else
		if marked evicted
			fput()
		return the descriptor
	[== pretend that open() always wins the race]
"open" in the above stands for everything that opens a descriptor - socket(2),
pipe(2), eventfd(2), whatever.
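
To make the refcount bookkeeping in the above concrete, here is a single-threaded
userspace toy model of it. This is only a sketch under loud assumptions: every
name in it (toy_file, toy_table, toy_open_begin() and friends) is made up for
illustration and none of it is kernel code. Locking, RCU and actual freeing are
left out, and toy_dup2() takes the source file directly instead of looking up
an old descriptor.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NFDS 8

/* Hypothetical stand-in for struct file; not the kernel's layout. */
struct toy_file {
	int refcount;
	bool embryo;	/* in the table, but open() has not finished yet */
	bool evicted;	/* dup2() took the slot away while it was an embryo */
};

static struct toy_file *toy_table[TOY_NFDS];

/* Drop one reference; a real fput() would free the file at zero. */
static void toy_fput(struct toy_file *f)
{
	f->refcount--;
}

/* First half of open(): install an embryo, the table holds the only ref. */
static int toy_open_begin(struct toy_file *f)
{
	for (int fd = 0; fd < TOY_NFDS; fd++) {
		if (!toy_table[fd]) {
			f->refcount = 1;
			f->embryo = true;
			f->evicted = false;
			toy_table[fd] = f;
			return fd;
		}
	}
	return -EMFILE;
}

/* dup2() onto newfd; src is assumed to be a fully set up file. */
static int toy_dup2(struct toy_file *src, int newfd)
{
	struct toy_file *old;

	if (newfd < 0 || newfd >= TOY_NFDS)
		return -EBADF;
	old = toy_table[newfd];
	if (old) {
		if (old->embryo)
			old->evicted = true;	/* ref is handed to open() */
		else
			toy_fput(old);		/* ordinary replacement */
	}
	src->refcount++;
	toy_table[newfd] = src;
	return newfd;
}

/* close(): an embryo is treated as if the descriptor were not open. */
static int toy_close(int fd)
{
	struct toy_file *f;

	if (fd < 0 || fd >= TOY_NFDS)
		return -EBADF;
	f = toy_table[fd];
	if (!f || f->embryo)
		return -EBADF;
	toy_table[fd] = NULL;
	toy_fput(f);
	return 0;
}

/* Second half of open(): err is 0 on success, negative errno on failure. */
static int toy_open_finish(struct toy_file *f, int fd, int err)
{
	bool evicted = f->evicted;

	f->embryo = false;
	if (err) {
		if (!evicted)
			toy_table[fd] = NULL;
		toy_fput(f);	/* drop the table's (or the transferred) ref */
		return err;
	}
	if (evicted)
		toy_fput(f);	/* slot is gone, drop the transferred ref */
	return fd;
}
```

In this model the single reference taken at install time is dropped exactly
once on every path: by open() if it failed or was evicted, and by a later
close() or dup2() otherwise.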

> From userspace side the only programs which can ever see EBUSY are
> buggy or trying to screw the kernel, so not a concern on that front.

Agreed. I'm not saying we should go that way.