Re: [syzbot] [cgroups?] general protection fault in rebuild_sched_domains_locked

From: Waiman Long

Date: Tue Feb 24 2026 - 00:40:33 EST


On 2/23/26 11:03 PM, Chen Ridong wrote:

On 2026/2/16 13:57, Waiman Long wrote:
On 2/15/26 4:05 PM, syzbot wrote:
Hello,

syzbot found the following issue on:

HEAD commit:    37a93dd5c49b Merge tag 'net-next-7.0' of git://git.kernel...
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1649d073980000
kernel config:  https://syzkaller.appspot.com/x/.config?x=a512b4a06724b76a
dashboard link: https://syzkaller.appspot.com/bug?extid=460792609a79c085f79f
compiler:       gcc (Debian 14.2.0-19) 14.2.0, GNU ld (GNU Binutils for Debian) 2.44
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=152086e6580000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=139c2eef980000

Downloadable assets:
disk image:
https://storage.googleapis.com/syzbot-assets/0dedaafff2ad/disk-37a93dd5.raw.xz
vmlinux:
https://storage.googleapis.com/syzbot-assets/aa7fae081497/vmlinux-37a93dd5.xz
kernel image:
https://storage.googleapis.com/syzbot-assets/9096b39b53e1/bzImage-37a93dd5.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+460792609a79c085f79f@xxxxxxxxxxxxxxxxxxxxxxxxx

Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
CPU: 1 UID: 0 PID: 5994 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/24/2026
RIP: 0010:bitmap_subset include/linux/bitmap.h:433 [inline]
RIP: 0010:cpumask_subset include/linux/cpumask.h:836 [inline]
RIP: 0010:rebuild_sched_domains_locked+0x2aa/0x980 kernel/cgroup/cpuset.c:967
Code: 7d 05 00 41 83 c4 01 89 de 48 83 c5 08 44 89 e7 e8 fb 76 05 00 41 39 dc
0f 8d 4c 04 00 00 e8 fd 7c 05 00 48 89 e8 48 c1 e8 03 <42> 80 3c 30 00 0f 85
1d 06 00 00 48 8b 04 24 48 23 45 00 31 ff 44
RSP: 0018:ffffc90003ecfbc0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000020
RDX: ffff888028de0000 RSI: ffffffff8200f003 RDI: ffffffff8df14f28
RBP: 0000000000000000 R08: 0000000000000cc0 R09: 00000000ffffffff
R10: ffffffff8e7d95b3 R11: 0000000000000001 R12: 0000000000000000
R13: 00000000000f4240 R14: dffffc0000000000 R15: 0000000000000000
FS:  000055555c694500(0000) GS:ffff8881246a5000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b2f463fff CR3: 000000003704c000 CR4: 00000000003526f0
Call Trace:
  <TASK>
  rebuild_sched_domains_cpuslocked kernel/cgroup/cpuset.c:983 [inline]
  rebuild_sched_domains+0x21/0x40 kernel/cgroup/cpuset.c:990
  sched_rt_handler+0xb5/0xe0 kernel/sched/rt.c:2911
  proc_sys_call_handler+0x47f/0x5a0 fs/proc/proc_sysctl.c:600
  new_sync_write fs/read_write.c:595 [inline]
  vfs_write+0x6ac/0x1070 fs/read_write.c:688
  ksys_write+0x12a/0x250 fs/read_write.c:740
  do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
  do_syscall_64+0x106/0xf80 arch/x86/entry/syscall_64.c:94
  entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fe00db9bf79
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48
89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73
01 c3 48 c7 c1 e8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fff27bcda88 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007fe00de15fa0 RCX: 00007fe00db9bf79
RDX: 00000000000000f6 RSI: 0000200000000000 RDI: 0000000000000003
RBP: 00007fff27bcdaf0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
R13: 00007fe00de15fac R14: 00007fe00de15fa0 R15: 00007fe00de15fa0
  </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:bitmap_subset include/linux/bitmap.h:433 [inline]
RIP: 0010:cpumask_subset include/linux/cpumask.h:836 [inline]
RIP: 0010:rebuild_sched_domains_locked+0x2aa/0x980 kernel/cgroup/cpuset.c:967
Code: 7d 05 00 41 83 c4 01 89 de 48 83 c5 08 44 89 e7 e8 fb 76 05 00 41 39 dc
0f 8d 4c 04 00 00 e8 fd 7c 05 00 48 89 e8 48 c1 e8 03 <42> 80 3c 30 00 0f 85
1d 06 00 00 48 8b 04 24 48 23 45 00 31 ff 44
RSP: 0018:ffffc90003ecfbc0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000020
RDX: ffff888028de0000 RSI: ffffffff8200f003 RDI: ffffffff8df14f28
RBP: 0000000000000000 R08: 0000000000000cc0 R09: 00000000ffffffff
R10: ffffffff8e7d95b3 R11: 0000000000000001 R12: 0000000000000000
R13: 00000000000f4240 R14: dffffc0000000000 R15: 0000000000000000
FS:  000055555c694500(0000) GS:ffff8881246a5000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b2f463fff CR3: 000000003704c000 CR4: 00000000003526f0
----------------
Code disassembly (best guess), 1 bytes skipped:
    0:    05 00 41 83 c4           add    $0xc4834100,%eax
    5:    01 89 de 48 83 c5        add    %ecx,-0x3a7cb722(%rcx)
    b:    08 44 89 e7              or     %al,-0x19(%rcx,%rcx,4)
    f:    e8 fb 76 05 00           call   0x5770f
   14:    41 39 dc                 cmp    %ebx,%r12d
   17:    0f 8d 4c 04 00 00        jge    0x469
   1d:    e8 fd 7c 05 00           call   0x57d1f
   22:    48 89 e8                 mov    %rbp,%rax
   25:    48 c1 e8 03              shr    $0x3,%rax
* 29:    42 80 3c 30 00           cmpb   $0x0,(%rax,%r14,1) <-- trapping instruction
   2e:    0f 85 1d 06 00 00        jne    0x651
   34:    48 8b 04 24              mov    (%rsp),%rax
   38:    48 23 45 00              and    0x0(%rbp),%rax
   3c:    31 ff                    xor    %edi,%edi
   3e:    44                       rex.R
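
The address arithmetic behind the trapping instruction can be checked by hand. The following is a minimal userspace sketch, not kernel code. It assumes generic KASAN with one shadow byte per 8 bytes of memory, that R14 holds the shadow offset, and 4-level paging (48-bit canonical addresses). The helper names are hypothetical. The register values (RBP = 0, R14 = 0xdffffc0000000000) are taken from the dump above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: generic KASAN, one shadow byte covers 8 bytes of memory. */
#define KASAN_SHADOW_SCALE_SHIFT 3
/* Value of R14 in the register dump; assumed to be the shadow offset. */
#define KASAN_SHADOW_OFFSET 0xdffffc0000000000ULL

/* Address of the shadow byte that is read before a load from addr. */
static uint64_t kasan_shadow_addr(uint64_t addr)
{
	return (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET;
}

/* Assumption: 4-level paging, so bits 63..47 must all be equal. */
static int is_canonical_48(uint64_t addr)
{
	uint64_t top = addr >> 47;

	return top == 0 || top == 0x1ffff;
}

/* Re-derive the faulting address from the dump: mov %rbp,%rax; shr $3,%rax. */
void check_against_dump(void)
{
	uint64_t rbp = 0x0;	/* RBP in the register dump */

	/* cmpb $0x0,(%rax,%r14,1) reads (rbp >> 3) + r14 */
	assert(kasan_shadow_addr(rbp) == 0xdffffc0000000000ULL);
	/* That shadow address is non-canonical, hence the #GP. */
	assert(!is_canonical_48(kasan_shadow_addr(rbp)));
}
```

Under these assumptions, the reported non-canonical address is the shadow address of a load from address 0, which matches the KASAN range [0x0000000000000000-0x0000000000000007] printed in the oops.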
Line 967 of cpuset.c is:

    966         for (i = 0; i < ndoms; ++i) {
    967                 if (WARN_ON_ONCE(!cpumask_subset(doms[i], cpu_active_mask)))
    968                         return;

The oops was caused by accessing doms[i]. The doms array is kmalloc'ed in
generate_sched_domains() via alloc_sched_domains() in
kernel/sched/topology.c. Looking at the console log just before the oops, I saw:

[  124.398850][ T5994] FAULT_INJECTION: forcing a failure.
[  124.398850][ T5994] name failslab, interval 1, probability 0, space 0, times 1
[  124.434865][ T5994] CPU: 1 UID: 0 PID: 5994 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full)
[  124.434909][ T5994] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/24/2026
[  124.434936][ T5994] Call Trace:
[  124.434947][ T5994]  <TASK>
[  124.434959][ T5994]  dump_stack_lvl+0x100/0x190
[  124.435026][ T5994]  should_fail_ex.cold+0x5/0xa
[  124.435062][ T5994]  ? rebuild_sched_domains_locked+0x51/0x980
[  124.435113][ T5994]  should_failslab+0xc2/0x120
[  124.435153][ T5994]  __kmalloc_noprof+0xe0/0x850
[  124.435195][ T5994]  rebuild_sched_domains_locked+0x51/0x980
[  124.435266][ T5994]  rebuild_sched_domains+0x21/0x40
[  124.435314][ T5994]  sched_rt_handler+0xb5/0xe0
[  124.435359][ T5994]  proc_sys_call_handler+0x47f/0x5a0
[  124.435413][ T5994]  ? __pfx_proc_sys_call_handler+0x10/0x10
[  124.435475][ T5994]  vfs_write+0x6ac/0x1070
[  124.435511][ T5994]  ? __pfx_proc_sys_write+0x10/0x10
[  124.435562][ T5994]  ? __pfx_vfs_write+0x10/0x10
[  124.435597][ T5994]  ? __pfx_do_sys_openat2+0x10/0x10
[  124.435664][ T5994]  ksys_write+0x12a/0x250
[  124.435696][ T5994]  ? __pfx_ksys_write+0x10/0x10
[  124.435730][ T5994]  ? do_user_addr_fault+0x8d6/0x12f0
[  124.435787][ T5994]  do_syscall_64+0x106/0xf80
[  124.435834][ T5994]  ? clear_bhb_loop+0x40/0x90
[  124.435875][ T5994]  entry_SYSCALL_64_after_hwframe+0x77/0x7f

So the oops looks like an expected result of the injected allocation failure.
AFAICS, it may not be a bug in cpuset.

Hi Longman,

Thank you for looking into this issue.

Since partition_sched_domains_locked() can handle the case where 'doms' is
NULL, I think we should make rebuild_sched_domains_locked() robust against it
as well.

The fix can be implemented as follows:

In cpuset.c at line 964:

         for (i = 0; i < ndoms; ++i) {
-                if (WARN_ON_ONCE(!cpumask_subset(doms[i], cpu_active_mask)))
+                if (doms && WARN_ON_ONCE(!cpumask_subset(doms[i],
+                                                        cpu_active_mask)))
                         return;
         }
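
For illustration only, here is a userspace sketch of the kind of guard proposed above, with the NULL test done once before the loop instead of on every iteration. It is not the kernel's code. `mask_t`, `mask_subset()` and `doms_within_active()` are hypothetical stand-ins for `cpumask_var_t`, `cpumask_subset()` and the loop in rebuild_sched_domains_locked(). It assumes that a NULL `doms` with a nonzero `ndoms` means "fall back to the default domain". Such a guard only helps if `doms` really is NULL at that point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a cpumask: one bit per CPU, up to 64 CPUs. */
typedef unsigned long long mask_t;

/* True when every bit set in a is also set in b. */
static bool mask_subset(mask_t a, mask_t b)
{
	return (a & ~b) == 0;
}

/*
 * Returns true when every domain is covered by the active mask.
 * Assumption: a NULL doms array means "use the default domain", so
 * there is nothing to validate and no doms[i] may be touched.
 */
bool doms_within_active(const mask_t *doms, int ndoms, mask_t active)
{
	assert(ndoms >= 0);

	if (!doms)
		return true;	/* checked once, before any doms[i] access */

	for (int i = 0; i < ndoms; ++i) {
		if (!mask_subset(doms[i], active))
			return false;
	}
	return true;
}
```

Calling `doms_within_active(NULL, 1, active)` returns true without dereferencing anything, while a non-NULL array is validated element by element.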

The problem is that doms is not NULL. It is 0xdffffc0000000000 as shown in the
dmesg log. So the null check here won't do any good in this particular case.
In fact, there is already a null check right after alloc_sched_domains() above.

Cheers, Longman