Re: linux 5.14.3: free_user_ns causes NULL pointer dereference
From: Eric W. Biederman
Date: Mon Oct 04 2021 - 13:10:46 EST
Adding Rune Kleveland to the discussion as he also seems to have
reproduced the issue.
Alex and I have been staring at the code and the reports, and this
bug is hiding well. Here is what we have figured out so far.
Both the warning from free_user_ns calling dec_ucount that Jordan Glover
reported and the KASAN error that Yu Zhao has reported appear to have
the same cause: using a ucounts structure after it has been freed and
reallocated as something else.
I have just skimmed through the recent report from Rune Kleveland
and it also appears to be a use after free, especially since the
second failure in the log is slub complaining about trying to free
the ucounts data structure.
We looked through the users of put_ucounts and we don't see any obvious
buggy users that would be freeing the data structure early.
Alex has tried to reproduce this but so far is not having any luck.
Folks, can you tell us what compiler versions you are using and share
your kernel config with us? That might help.
The little debug diff below is my guess of what is happening. If the
folks who can reproduce this issue can try the patch below and let me
know if the warnings fire, that would be appreciated. It is still not
enough to track down the bug, but at least it will confirm my current
hypothesis about how things look before there is a use of memory after
it is freed.
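For anyone reading along, here is a rough userspace sketch of the condition
the debug diff tests. It is not kernel code: the struct and function names
(ucounts_model, user_ns_model, cred_model, ucounts_ref_looks_short) are
hypothetical stand-ins that model only the fields the check reads. The
assumption behind it is that a cred and its user namespace pointing at the
same ucounts should each hold a reference, so a count of 1 would mean the
next put frees a structure that is still pointed to.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel structures. Only the fields
 * read by the debug check are modeled. */
struct ucounts_model {
	atomic_int count;		/* reference count */
};

struct user_ns_model {
	struct ucounts_model *ucounts;	/* assumed to hold one reference */
};

struct cred_model {
	struct ucounts_model *ucounts;	/* assumed to hold one reference */
	struct user_ns_model *user_ns;
};

/*
 * Same condition as the debug diff: the cred and its user namespace
 * share one ucounts, yet only a single reference is outstanding.
 * Under the assumptions above there should be at least two, so the
 * cred's final put would free memory the namespace still points at.
 */
static bool ucounts_ref_looks_short(const struct cred_model *cred)
{
	return cred->ucounts == cred->user_ns->ucounts &&
	       atomic_load(&cred->ucounts->count) == 1;
}
```

With a shared ucounts whose count is 2 the function returns false; dropping
the count to 1 makes it return true, which is the state the WARN_ONCE calls
in the diff are meant to catch.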
Thank you,
Eric
diff --git a/kernel/cred.c b/kernel/cred.c
index f784e08c2fbd..e7ffaa3cf5a6 100644
--- a/kernel/cred.c
+++ b/kernel/cred.c
@@ -120,6 +120,12 @@ static void put_cred_rcu(struct rcu_head *rcu)
 	if (cred->group_info)
 		put_group_info(cred->group_info);
 	free_uid(cred->user);
+#if 1
+	if ((cred->ucounts == cred->user_ns->ucounts) &&
+	    (atomic_read(&cred->ucounts->count) == 1)) {
+		WARN_ONCE(1, "put_cred_rcu: ucount count 1\n");
+	}
+#endif
 	if (cred->ucounts)
 		put_ucounts(cred->ucounts);
 	put_user_ns(cred->user_ns);
diff --git a/kernel/exit.c b/kernel/exit.c
index 91a43e57a32e..60fd88b34c1a 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -743,6 +743,13 @@ void __noreturn do_exit(long code)
 	if (unlikely(!tsk->pid))
 		panic("Attempted to kill the idle task!");
 
+#if 1
+	if ((tsk->cred->ucounts == tsk->cred->user_ns->ucounts) &&
+	    (atomic_read(&tsk->cred->ucounts->count) == 1)) {
+		WARN_ONCE(1, "do_exit: ucount count 1\n");
+	}
+#endif
+
 	/*
 	 * If do_exit is called because this processes oopsed, it's possible
 	 * that get_fs() was left as KERNEL_DS, so reset it to USER_DS before