Re: Questions about the patch 054aa8d439b9 ("fget: check that the fd still exists after getting a ref to it")

From: libaokun (A)
Date: Sun Jan 09 2022 - 20:26:09 EST


Happy New Year!

ping

在 2021/12/22 18:32, libaokun (A) 写道:
From: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Date: Wed, 1 Dec 2021 10:06:14 -0800
Subject: fget: check that the fd still exists after getting a ref to it

Jann Horn points out that there is another possible race wrt Unix domain
socket garbage collection, somewhat reminiscent of the one fixed in
commit cbcf01128d0a ("af_unix: fix garbage collect vs MSG_PEEK").

See the extended comment about the garbage collection requirements added
to unix_peek_fds() by that commit for details.

The race comes from how we can locklessly look up a file descriptor just
as it is in the process of being closed, and with the right artificial
timing (Jann added a few strategic 'mdelay(500)' calls to do that), the
Unix domain socket garbage collector could see the reference count
decrement of the close() happen before fget() took its reference to the
file and the file was attached onto a new file descriptor.

I analyzed this CVE and tried to reproduce it.

I guess he triggered it like the stack below.


close_fd                               |
 pick_file                             |
                                       | __fget_files
file = files_lookup_fd_rcu(files, fd); |
                                       | rcu_assign_pointer(fdt->fd[fd], NULL);
 filp_close                            |
  fput                                 |
                                       | get_file_rcu_many // ned ref>=1
   fput_many(file, 1);                 |
    file_free(file);                   |
                                       |  return file
                                       |  // read-after-free



If you want to successfully execute the get_file_rcu_many function,

the reference counting of the file is greater than or equal to 1 and

is greater than or equal to 2 after the execution.

However, close releases only one reference count and does not release the file,

so read-after-free does not occur. So how is the race triggered here?

The question has been pondered for a long time without any results.

Could I get more details (e.g. reproduction methods or stacks) from you ?

I would appreciate it if you could help me.


This is all (intentionally) correct on the 'struct file *' side, with
RCU lookups and lockless reference counting very much part of the
design. Getting that reference count out of order isn't a problem per
se.

But the garbage collector can get confused by seeing this situation of
having seen a file not having any remaining external references and then
seeing it being attached to an fd.

In commit cbcf01128d0a ("af_unix: fix garbage collect vs MSG_PEEK") the
fix was to serialize the file descriptor install with the garbage
collector by taking and releasing the unix_gc_lock.

That's not really an option here, but since this all happens when we are
in the process of looking up a file descriptor, we can instead simply
just re-check that the file hasn't been closed in the meantime, and just
re-do the lookup if we raced with a concurrent close() of the same file
descriptor.

Reported-and-tested-by: Jann Horn <jannh@xxxxxxxxxx>
Acked-by: Miklos Szeredi <mszeredi@xxxxxxxxxx>
Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
---
fs/file.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/fs/file.c b/fs/file.c
index 8627dacfc4246..ad4a8bf3cf109 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -858,6 +858,10 @@ loop:
file = NULL;
else if (!get_file_rcu_many(file, refs))
goto loop;
+ else if (files_lookup_fd_raw(files, fd) != file) {
+ fput_many(file, refs);
+ goto loop;
+ }
}
rcu_read_unlock();
-- cgit 1.2.3-1.el7

Looking forward to hearing from you.

Thank you!