[PATCH 5.15 165/172] sched/fair: Fix fault in reweight_entity

From: Greg Kroah-Hartman
Date: Mon Feb 14 2022 - 05:18:28 EST


From: Tadeusz Struk <tadeusz.struk@xxxxxxxxxx>

commit 13765de8148f71fa795e0a6607de37c49ea5915a upstream.

Syzbot found a general protection fault (GPF) in reweight_entity().
This has been bisected to commit 4ef0c5c6b5ba ("kernel/sched: Fix
sched_fork() access an invalid sched_task_group").

There is a race between sched_post_fork() and setpriority(PRIO_PGRP)
within a thread group that causes a null-ptr-deref in
reweight_entity() in CFS. The scenario is that the main process spawns
a number of new threads, which then call setpriority(PRIO_PGRP, 0, -20),
wait, and exit. For each of the new threads copy_process() gets
invoked, which adds the new task_struct and calls sched_post_fork()
for it.

In the above scenario there is a possibility that
setpriority(PRIO_PGRP) and set_one_prio() will be called for a thread
in the group that is just being created by copy_process(), and for
which the sched_post_fork() has not been executed yet. This will
trigger a null pointer dereference in reweight_entity(), as it will
try to access the run queue pointer, which hasn't been set.
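
For illustration only, the user-space side of the scenario above might
look roughly like the sketch below. It is not the syzbot reproducer.
The helper name spawn_renice_threads(), the cap of 64 threads, and the
1 ms delay are assumptions made for this example. Lowering the nice
value to -20 normally requires CAP_SYS_NICE, so the nice value is left
as a parameter.

```c
#include <pthread.h>
#include <sys/resource.h>
#include <unistd.h>

/* Thread body: renice the caller's whole process group, wait, exit. */
static void *renice_group(void *arg)
{
	int niceval = *(int *)arg;

	/* who == 0 selects the calling process's process group. */
	setpriority(PRIO_PGRP, 0, niceval);
	usleep(1000);	/* arbitrary short wait (assumption) */
	return NULL;
}

/*
 * Hypothetical helper: start up to 64 threads that all call
 * setpriority(PRIO_PGRP) while their siblings are still being
 * created. Returns the number of threads successfully joined.
 */
static int spawn_renice_threads(int n, int niceval)
{
	pthread_t tid[64];
	int started = 0, joined = 0;

	if (n > 64)
		n = 64;
	for (int i = 0; i < n; i++)
		if (pthread_create(&tid[started], NULL, renice_group, &niceval) == 0)
			started++;
	for (int i = 0; i < started; i++)
		if (pthread_join(tid[i], NULL) == 0)
			joined++;
	return joined;
}
```

Calling spawn_renice_threads(64, -20) in a loop as root would
approximate the pattern described above. Whether the race window is
actually hit depends on timing.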

Before the mentioned change the cfs_rq pointer for the task was set
in sched_fork(), which is called much earlier in copy_process(),
before the new task is added to the thread_group. Now it is done in
sched_post_fork(), which is called after that. To fix the issue,
remove the update_load param from the set_load_weight() function
and call reweight_task() only if the task state doesn't have the
TASK_NEW flag set.
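
As a rough user-space model of that gating (not kernel code), the
sketch below shows why checking TASK_NEW avoids touching a run queue
that has not been set up yet. struct model_task and
model_set_load_weight() are hypothetical names, the TASK_NEW value is
assumed to mirror include/linux/sched.h, and the idle-policy and
sched_class checks of the real function are left out.

```c
#include <stdbool.h>

/* Assumed to mirror TASK_NEW in include/linux/sched.h. */
#define TASK_NEW 0x0800u

struct model_task {
	unsigned int state;	/* stands in for p->__state */
	bool has_cfs_rq;	/* stands in for a valid cfs_rq pointer */
	int reweights;		/* counts simulated reweight_task() calls */
};

/*
 * Returns false if the model would have dereferenced a missing
 * run queue, true otherwise.
 */
static bool model_set_load_weight(struct model_task *t)
{
	bool update_load = !(t->state & TASK_NEW);

	if (!update_load)
		return true;	/* weight stored directly, no run-queue access */
	if (!t->has_cfs_rq)
		return false;	/* the pre-fix crash path */
	t->reweights++;		/* simulated reweight_task() */
	return true;
}
```

In this model a task that is still TASK_NEW never reaches the
run-queue access, which is the property the patch relies on.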

Fixes: 4ef0c5c6b5ba ("kernel/sched: Fix sched_fork() access an invalid sched_task_group")
Reported-by: syzbot+af7a719bc92395ee41b3@xxxxxxxxxxxxxxxxxxxxxxxxx
Signed-off-by: Tadeusz Struk <tadeusz.struk@xxxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@xxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx
Link: https://lkml.kernel.org/r/20220203161846.1160750-1-tadeusz.struk@xxxxxxxxxx
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
---
kernel/sched/core.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)

--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1199,8 +1199,9 @@ int tg_nop(struct task_group *tg, void *
}
#endif

-static void set_load_weight(struct task_struct *p, bool update_load)
+static void set_load_weight(struct task_struct *p)
{
+ bool update_load = !(READ_ONCE(p->__state) & TASK_NEW);
int prio = p->static_prio - MAX_RT_PRIO;
struct load_weight *load = &p->se.load;

@@ -4358,7 +4359,7 @@ int sched_fork(unsigned long clone_flags
p->static_prio = NICE_TO_PRIO(0);

p->prio = p->normal_prio = p->static_prio;
- set_load_weight(p, false);
+ set_load_weight(p);

/*
* We don't need the reset flag anymore after the fork. It has
@@ -6902,7 +6903,7 @@ void set_user_nice(struct task_struct *p
put_prev_task(rq, p);

p->static_prio = NICE_TO_PRIO(nice);
- set_load_weight(p, true);
+ set_load_weight(p);
old_prio = p->prio;
p->prio = effective_prio(p);

@@ -7193,7 +7194,7 @@ static void __setscheduler_params(struct
*/
p->rt_priority = attr->sched_priority;
p->normal_prio = normal_prio(p);
- set_load_weight(p, true);
+ set_load_weight(p);
}

/*
@@ -9431,7 +9432,7 @@ void __init sched_init(void)
#endif
}

- set_load_weight(&init_task, false);
+ set_load_weight(&init_task);

/*
* The boot idle thread does lazy MMU switching as well: