Re: [PATCH 1/1] sched/fair: Fix invalid pointer dereference in child_cfs_rq_on_list()
From: Dietmar Eggemann
Date: Wed Mar 05 2025 - 04:24:38 EST
On 05/03/2025 09:21, Vincent Guittot wrote:
> On Tue, 4 Mar 2025 at 18:00, Aboorva Devarajan <aboorvad@xxxxxxxxxxxxx> wrote:
>>
>> In child_cfs_rq_on_list(), leaf_cfs_rq_list.prev is expected to point to
>> a valid cfs_rq->leaf_cfs_rq_list in the hierarchy. However, when accessed
>> from the first node in a list, leaf_cfs_rq_list.prev can incorrectly point
>> back to the list head (rq->leaf_cfs_rq_list) instead of another
>> cfs_rq->leaf_cfs_rq_list.
>>
>> The function does not handle this case, leading to incorrect pointer
>> calculations and unintended memory accesses, which can result in a kernel
>> crash.
>>
>> A recent attempt to reorder fields in struct rq exposed this issue by
>> modifying memory offsets and affecting how pointer computations are
>> resolved. While the problem existed before, it was previously masked by
>> specific field arrangement. The reordering caused erroneous pointer
>> accesses, leading to a NULL dereference and a crash, as seen in the

I'm running tip/sched/core on arm64 and I still only see a wrong pointer
for 'prev_cfs_rq->tg->parent' in the 'prev == &rq->leaf_cfs_rq_list'
case?

...
cpu=5 prev_cfs_rq->tg=ffff00097efb63a0 parent=0000000000000010
cfs_rq->tg=ffff000802084000
...

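To illustrate why the value read in that case depends on the layout of
struct rq, here is a minimal userspace sketch. The structures are heavily
reduced, hypothetical stand-ins for the kernel ones, and the helpers
prev_is_list_head() and bogus_tg_offset() are names made up for this
example. It only shows that the first node's .prev is the list head, and
where container_of() would land if it were applied to that head.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, reduced stand-ins; the real kernel structs are far larger. */
struct task_group { struct task_group *parent; };

struct cfs_rq {
	struct task_group *tg;
	int on_list;
	struct list_head leaf_cfs_rq_list;	/* node embedded in a cfs_rq */
};

struct rq {
	unsigned long nr_running;		/* arbitrary neighbouring field */
	struct list_head leaf_cfs_rq_list;	/* list head, NOT inside a cfs_rq */
};

static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert 'node' right after 'head', i.e. at the front of the list. */
static void list_add(struct list_head *node, struct list_head *head)
{
	node->next = head->next;
	node->prev = head;
	head->next->prev = node;
	head->next = node;
}

/* True when the entry before cfs_rq is the head, not another cfs_rq. */
static bool prev_is_list_head(const struct rq *rq, const struct cfs_rq *cfs_rq)
{
	return cfs_rq->leaf_cfs_rq_list.prev == &rq->leaf_cfs_rq_list;
}

/*
 * Byte offset, relative to the start of struct rq, of the memory that
 * would be read as prev_cfs_rq->tg if container_of() were applied to the
 * list head. It is computed from offsets only, so nothing is dereferenced.
 */
static long bogus_tg_offset(void)
{
	return (long)offsetof(struct rq, leaf_cfs_rq_list)
	     - (long)offsetof(struct cfs_rq, leaf_cfs_rq_list)
	     + (long)offsetof(struct cfs_rq, tg);
}
```

Since bogus_tg_offset() changes whenever fields in front of the list head
move, reordering struct rq would change which bytes get misread as 'tg'.
That would be consistent with seeing only a wrong 'parent' value with one
layout and a crash with another.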
>> following trace:
>>
>> [ 2.059873] systemd[1]: Hostname set to ...
>> [ 2.152682] BUG: Unable to handle kernel data access on read at 0x100000125
>> [ 2.152717] Faulting instruction address: 0xc0000000001c0270
>> [ 2.152724] Oops: Kernel access of bad area, sig: 7 [#1]
>> [ 2.152852] Call Trace:
>> [ 2.152855] __update_blocked_fair+0x45c/0x6a0 (unreliable)
>> [ 2.152862] sched_balance_update_blocked_averages+0x11c/0x24c
>> [ 2.152869] sched_balance_softirq+0x60/0x9c
>> [ 2.152876] handle_softirqs+0x148/0x3b4
>> [ 2.152884] do_softirq_own_stack+0x40/0x54
>> [ 2.152891] __irq_exit_rcu+0x18c/0x1b4
>> [ 2.152897] irq_exit+0x20/0x38
>> [ 2.152903] timer_interrupt+0x174/0x30c
>> [ 2.152910] decrementer_common_virt+0x28c/0x290
>> ..
>>
>> To fix this, introduce a check to detect when prev points to the list head
>> (&rq->leaf_cfs_rq_list). If this condition is met, return early to prevent
>> the use of an invalid prev_cfs_rq.
>>
>> Fixes: fdaba61ef8a2 ("sched/fair: Ensure that the CFS parent is added after unthrottling")
>> Signed-off-by: Aboorva Devarajan <aboorvad@xxxxxxxxxxxxx>
>> ---
>> kernel/sched/fair.c | 7 +++++--
>> 1 file changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 1c0ef435a7aa..a4daa7a9af0b 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -4045,12 +4045,15 @@ static inline bool child_cfs_rq_on_list(struct cfs_rq *cfs_rq)
>>  {
>>  	struct cfs_rq *prev_cfs_rq;
>>  	struct list_head *prev;
>> +	struct rq *rq;
>> +
>> +	rq = rq_of(cfs_rq);
>>
>>  	if (cfs_rq->on_list) {
>>  		prev = cfs_rq->leaf_cfs_rq_list.prev;
>> +		if (prev == &rq->leaf_cfs_rq_list)
>> +			return false;
>
> What about the else case below? prev can also point to rq->leaf_cfs_rq_list.

That should be the same issue, IMHO. I'm not seeing it on my machine
during startup or in simple taskgroup tests, though: 'cfs_rq->on_list'
has always been 1 so far.

>>  	} else {
>> -		struct rq *rq = rq_of(cfs_rq);
>> -
>>  		prev = rq->tmp_alone_branch;
>>  	}
>>
>> --
>> 2.43.5
>>
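
For discussion, one way to cover both branches would be to do the list
head check once, after prev has been chosen. The following is only a
userspace sketch under stated assumptions, not a tested kernel patch. The
structures are reduced, hypothetical stand-ins, rq_of() here simply
returns a back pointer stored in the stand-in cfs_rq, and the function
carries a _sketch suffix to make clear it is not the in-tree one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, reduced stand-ins for the kernel structures. */
struct task_group { struct task_group *parent; };

struct rq;

struct cfs_rq {
	struct task_group *tg;
	int on_list;
	struct list_head leaf_cfs_rq_list;
	struct rq *rq;				/* back pointer, assumption */
};

struct rq {
	struct list_head leaf_cfs_rq_list;	/* list head */
	struct list_head *tmp_alone_branch;
};

/* Stand-in for the kernel's rq_of(); just follows the back pointer. */
static struct rq *rq_of(struct cfs_rq *cfs_rq)
{
	return cfs_rq->rq;
}

static bool child_cfs_rq_on_list_sketch(struct cfs_rq *cfs_rq)
{
	struct rq *rq = rq_of(cfs_rq);
	struct cfs_rq *prev_cfs_rq;
	struct list_head *prev;

	if (cfs_rq->on_list)
		prev = cfs_rq->leaf_cfs_rq_list.prev;
	else
		prev = rq->tmp_alone_branch;

	/* One check for both branches: the head is not embedded in a cfs_rq. */
	if (prev == &rq->leaf_cfs_rq_list)
		return false;

	prev_cfs_rq = container_of(prev, struct cfs_rq, leaf_cfs_rq_list);

	return prev_cfs_rq->tg->parent == cfs_rq->tg;
}
```

With the check placed after the if/else, the 'prev = rq->tmp_alone_branch'
path gets the same protection as the 'on_list' path, which is the case
Vincent asks about above.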