Re: [PATCH] fs: clear close-on-exec flag as part of put_unused_fd()

From: Al Viro
Date: Wed Dec 11 2013 - 18:30:35 EST


On Wed, Dec 11, 2013 at 11:36:35PM +0100, Mateusz Guzik wrote:

> From my reading this will break at least the following:
> fd = open(..., .. | O_CLOEXEC);
> dup2(whatever, fd);
>
> now fd has O_CLOEXEC even though it should not
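
A minimal userspace sketch of the quoted case, for anyone who wants to check
the behaviour that must be preserved. The helper names (fd_is_cloexec,
cloexec_after_dup2) and the use of /dev/null are illustrative assumptions,
not anything from the patch. POSIX requires FD_CLOEXEC to be cleared on the
descriptor that dup2() returns, so the function below is expected to return 0.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if fd has FD_CLOEXEC set, 0 if not, -1 on error. */
static int fd_is_cloexec(int fd)
{
	int flags = fcntl(fd, F_GETFD);

	if (flags < 0)
		return -1;
	return (flags & FD_CLOEXEC) != 0;
}

/*
 * Open a descriptor with O_CLOEXEC, then dup2() a second descriptor on
 * top of it.  Returns the close-on-exec state of the target afterwards
 * (0 means cleared), or -1 if any step fails.
 */
static int cloexec_after_dup2(void)
{
	int fd, src, ret;

	fd = open("/dev/null", O_RDONLY | O_CLOEXEC);
	if (fd < 0)
		return -1;

	src = open("/dev/null", O_RDONLY);
	if (src < 0) {
		close(fd);
		return -1;
	}

	/* Sanity check: the flag really is set before dup2(). */
	if (fd_is_cloexec(fd) != 1 || dup2(src, fd) < 0)
		ret = -1;
	else
		ret = fd_is_cloexec(fd);

	close(src);
	close(fd);
	return ret;
}
```

Calling cloexec_after_dup2() on a kernel that handles this correctly should
give 0; a result of 1 would mean the stale flag leaked through dup2().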

Moreover, consider fork() done by a thread that shares descriptor
table with somebody else. Suppose it happens in the middle of
open() with O_CLOEXEC being done by another thread. We copy descriptor
table after descriptor had been reserved (and marked close-on-exec),
but before a reference to struct file has actually been inserted there.
This code
	for (i = open_files; i != 0; i--) {
		struct file *f = *old_fds++;
		if (f) {
			get_file(f);
		} else {
			/*
			 * The fd may be claimed in the fd bitmap but not yet
			 * instantiated in the files array if a sibling thread
			 * is partway through open(). So make sure that this
			 * fd is available to the new process.
			 */
			__clear_open_fd(open_files - i, new_fdt);
		}
		rcu_assign_pointer(*new_fds++, f);
	}
	spin_unlock(&oldf->file_lock);
in dup_fd() will clear the corresponding bit in open_fds, leaving close_on_exec
alone. Currently that's fine (we will override whatever had been in
close_on_exec when we reserve that descriptor again), but AFAICS with this
patch it will break.
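
The sequence above can be modelled in userspace. What follows is a toy sketch,
not kernel code: struct toy_fdt, the two scheme names and all toy_* helpers are
hypothetical, and CLEAR_ON_FREE is only my assumption of what the patch amounts
to (set the bit on reservation when O_CLOEXEC is requested, and rely on
put_unused_fd() to clear it). The point it illustrates is that a copy which
clears open_fds but leaves close_on_exec alone is harmless only while
reservation always overwrites the close_on_exec bit.

```c
#include <assert.h>
#include <stdbool.h>

#define NFDS 8

/* Toy descriptor table: two bitmaps plus the file pointer array. */
struct toy_fdt {
	bool open_fds[NFDS];
	bool close_on_exec[NFDS];
	const void *fd[NFDS];
};

enum scheme {
	SET_ON_RESERVE,	/* current behaviour: reservation always writes the bit */
	CLEAR_ON_FREE,	/* assumed patched behaviour: only freeing clears it */
};

/* Reserve the lowest free slot; returns its index or -1 if the table is full. */
static int toy_reserve(struct toy_fdt *t, bool cloexec, enum scheme s)
{
	for (int i = 0; i < NFDS; i++) {
		if (t->open_fds[i])
			continue;
		t->open_fds[i] = true;
		if (s == SET_ON_RESERVE)
			t->close_on_exec[i] = cloexec;	/* overrides stale state */
		else if (cloexec)
			t->close_on_exec[i] = true;	/* stale state survives */
		return i;
	}
	return -1;
}

/*
 * Copy the table the way the quoted loop does: slots that are reserved
 * but not yet instantiated are dropped from open_fds only.
 */
static void toy_dup(struct toy_fdt *dst, const struct toy_fdt *src)
{
	*dst = *src;
	for (int i = 0; i < NFDS; i++)
		if (!dst->fd[i])
			dst->open_fds[i] = false;	/* close_on_exec left alone */
}

/*
 * A sibling thread is partway through open(O_CLOEXEC) when the table is
 * copied; the child then opens a file without O_CLOEXEC.  Returns true if
 * the child's new descriptor wrongly ends up marked close-on-exec.
 */
static bool toy_child_leaks_cloexec(enum scheme s)
{
	struct toy_fdt parent = {0}, child;
	int fd, cfd;

	fd = toy_reserve(&parent, true, s);	/* reserved, file not installed */
	toy_dup(&child, &parent);		/* fork() happens here */
	cfd = toy_reserve(&child, false, s);	/* child reuses the same slot */
	assert(cfd == fd);

	return child.close_on_exec[cfd];
}
```

In this model toy_child_leaks_cloexec(SET_ON_RESERVE) is false and
toy_child_leaks_cloexec(CLEAR_ON_FREE) is true, which is the breakage
described above.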

Sure, it can be fixed up (ditto with dup2(), etc.), but what's the point?
Result will require more subtle reasoning to prove correctness and will
be more prone to breakage. Does that really yield visible performance
improvements that would be worth the extra complexity? After all, you
trade some writes to close_on_exec on descriptor reservation for unconditional
write on descriptor freeing; if anything, I would expect that you'll get
minor _loss_ from that change, assuming the difference is measurable in the
first place...