Re: [PATCH 1/2] fs, close_range: add flag CLOSE_RANGE_CLOEXEC

From: Giuseppe Scrivano
Date: Tue Oct 13 2020 - 18:45:09 EST


Christian Brauner <christian.brauner@xxxxxxxxxx> writes:

> On Tue, Oct 13, 2020 at 04:06:08PM +0200, Giuseppe Scrivano wrote:
>
> Hey Guiseppe,
>
> Thanks for the patch!
>
>> When the flag CLOSE_RANGE_CLOEXEC is set, close_range doesn't
>> immediately close the files but it sets the close-on-exec bit.
>
> Hm, please expand on the use-cases a little here so people know where
> and how this is useful. Keeping the rationale for a change in the commit
> log is really important.
>
>>
>> Signed-off-by: Giuseppe Scrivano <gscrivan@xxxxxxxxxx>
>> ---
>
>> fs/file.c | 56 ++++++++++++++++++++++----------
>> include/uapi/linux/close_range.h | 3 ++
>> 2 files changed, 42 insertions(+), 17 deletions(-)
>>
>> diff --git a/fs/file.c b/fs/file.c
>> index 21c0893f2f1d..ad4ebee41e09 100644
>> --- a/fs/file.c
>> +++ b/fs/file.c
>> @@ -672,6 +672,17 @@ int __close_fd(struct files_struct *files, unsigned fd)
>> }
>> EXPORT_SYMBOL(__close_fd); /* for ksys_close() */
>>
>> +static unsigned int __get_max_fds(struct files_struct *cur_fds)
>> +{
>> + unsigned int max_fds;
>> +
>> + rcu_read_lock();
>> + /* cap to last valid index into fdtable */
>> + max_fds = files_fdtable(cur_fds)->max_fds;
>> + rcu_read_unlock();
>> + return max_fds;
>> +}
>> +
>> /**
>> * __close_range() - Close all file descriptors in a given range.
>> *
>> @@ -683,27 +694,23 @@ EXPORT_SYMBOL(__close_fd); /* for ksys_close() */
>> */
>> int __close_range(unsigned fd, unsigned max_fd, unsigned int flags)
>> {
>> - unsigned int cur_max;
>> + unsigned int cur_max = UINT_MAX;
>> struct task_struct *me = current;
>> struct files_struct *cur_fds = me->files, *fds = NULL;
>>
>> - if (flags & ~CLOSE_RANGE_UNSHARE)
>> + if (flags & ~(CLOSE_RANGE_UNSHARE | CLOSE_RANGE_CLOEXEC))
>> return -EINVAL;
>>
>> if (fd > max_fd)
>> return -EINVAL;
>>
>> - rcu_read_lock();
>> - cur_max = files_fdtable(cur_fds)->max_fds;
>> - rcu_read_unlock();
>> -
>> - /* cap to last valid index into fdtable */
>> - cur_max--;
>> -
>> if (flags & CLOSE_RANGE_UNSHARE) {
>> int ret;
>> unsigned int max_unshare_fds = NR_OPEN_MAX;
>>
>> + /* cap to last valid index into fdtable */
>> + cur_max = __get_max_fds(cur_fds) - 1;
>> +
>> /*
>> * If the requested range is greater than the current maximum,
>> * we're closing everything so only copy all file descriptors
>> @@ -724,16 +731,31 @@ int __close_range(unsigned fd, unsigned max_fd, unsigned int flags)
>> swap(cur_fds, fds);
>> }
>>
>> - max_fd = min(max_fd, cur_max);
>> - while (fd <= max_fd) {
>> - struct file *file;
>> + if (flags & CLOSE_RANGE_CLOEXEC) {
>> + struct fdtable *fdt;
>>
>> - file = pick_file(cur_fds, fd++);
>> - if (!file)
>> - continue;
>> + spin_lock(&cur_fds->file_lock);
>> + fdt = files_fdtable(cur_fds);
>> + cur_max = fdt->max_fds - 1;
>> + max_fd = min(max_fd, cur_max);
>> + while (fd <= max_fd)
>> + __set_close_on_exec(fd++, fdt);
>> + spin_unlock(&cur_fds->file_lock);
>> + } else {
>> + /* Initialize cur_max if needed. */
>> + if (cur_max == UINT_MAX)
>> + cur_max = __get_max_fds(cur_fds) - 1;
>
> The separation between how cur_fd is retrieved in the two branches makes
> the code more difficult to follow imho. Unless there's a clear reason
> why you've done it that way I would think that something like the patch
> I appended below might be a little clearer and easier to maintain(?).

Thanks for the review!

I've opted for this version as in the flags=CLOSE_RANGE_CLOEXEC case we
can read max_fds directly from the fds table and avoid doing it from the
RCU critical section as well. I'll change it in favor of more readable
code.

Giuseppe