Re: [PATCH] ucounts: Fix UCOUNT_RLIMIT_SIGPENDING counter leak

From: Linus Torvalds
Date: Fri Jul 02 2021 - 18:13:40 EST


On Fri, Jul 2, 2021 at 10:55 AM Alexey Gladkov <legion@xxxxxxxxxx> wrote:
>
> @@ -424,10 +424,10 @@ __sigqueue_alloc(int sig, struct task_struct *t, gfp_t gfp_flags,
> * changes from/to zero.
> */
> rcu_read_lock();
> - ucounts = task_ucounts(t);
> + ucounts = ucounts_new = task_ucounts(t);
> sigpending = inc_rlimit_ucounts(ucounts, UCOUNT_RLIMIT_SIGPENDING, 1);
> if (sigpending == 1)
> - ucounts = get_ucounts(ucounts);
> + ucounts_new = get_ucounts(ucounts);
> rcu_read_unlock();

I think this is still problematic.

If get_ucounts() fails, we can't just drop the RCU lock and (later)
use "ucounts" that we hold no reference to.

Or am I missing something? I'm not entirely sure about the lifetime of
that RCU protection, and I do note that "task_ucounts()" uses
"task_cred_xxx()", which already does
rcu_read_lock()/rcu_read_unlock() in the actual access.

So I'm thinking the code could/should be written something like this instead:

diff --git a/kernel/signal.c b/kernel/signal.c
index f6371dfa1f89..40781b197227 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -422,22 +422,33 @@ __sigqueue_alloc(int sig, struct task_struct
*t, gfp_t gfp_flags,
* NOTE! A pending signal will hold on to the user refcount,
* and we get/put the refcount only when the sigpending count
* changes from/to zero.
+ *
+ * And if the ucount rlimit overflowed, we do not get to use it at all.
*/
rcu_read_lock();
ucounts = task_ucounts(t);
sigpending = inc_rlimit_ucounts(ucounts, UCOUNT_RLIMIT_SIGPENDING, 1);
- if (sigpending == 1)
- ucounts = get_ucounts(ucounts);
+ switch (sigpending) {
+ case 1:
+ if (likely(get_ucounts(ucounts)))
+ break;
+
+ dec_rlimit_ucounts(ucounts, UCOUNT_RLIMIT_SIGPENDING, 1);
+ fallthrough;
+ case LONG_MAX:
+ rcu_read_unlock();
+ return NULL;
+ }
rcu_read_unlock();

- if (override_rlimit || (sigpending < LONG_MAX && sigpending <=
task_rlimit(t, RLIMIT_SIGPENDING))) {
+ if (override_rlimit || sigpending <= task_rlimit(t,
RLIMIT_SIGPENDING)) {
q = kmem_cache_alloc(sigqueue_cachep, gfp_flags);
} else {
print_dropped_signal(sig);
}

if (unlikely(q == NULL)) {
- if (ucounts && dec_rlimit_ucounts(ucounts,
UCOUNT_RLIMIT_SIGPENDING, 1))
+ if (dec_rlimit_ucounts(ucounts, UCOUNT_RLIMIT_SIGPENDING, 1))
put_ucounts(ucounts);
} else {
INIT_LIST_HEAD(&q->list);

(and no, I'm not sure it's a good idea to make that use a "switch()" -
maybe the LONG_MAX case should be a "if (unlikely())" thing after the
rcu_read_ulock() instead?

Hmm?

The alternate thing is to say "No, Linus, you're a nincompoop and
wrong, that RCU protection is a non-issue because we hold a reference
to the task, and task_ucounts() will not change, so the RCU read lock
doesn't do anything".

Although then I would think the rcu_read_lock/rcu_read_unlock here is
entirely pointless.

Linus