net: cxgb4: Call Trace reported with PREEMPT_RT: BUG: using smp_processor_id() in preemptible [00000000] code: ethtool/78718

From: John B. Wyatt IV
Date: Tue Apr 23 2024 - 00:10:35 EST


Hello Raju, Hello Sebastian,

Red Hat QE found this issue with cxgb4. It occurs only when the kernel is built
with PREEMPT_RT enabled, using the preempt-rt patchset:

git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git

We are also seeing this in the real-time builds of RHEL 9 and RHEL 8.

The specific build is an internal build that was pulled from the mirror Clark
Williams set up for Fedora and RHEL testing.

https://gitlab.com/cki-project/kernel-ark/-/tree/os-build-rt?ref_type=heads

We use the branch: os-build-rt

I was unable to find the cause of this and I thought I should report it.

Please let me know if you have any questions or need any testing done.

Call trace is below:

kernel-rt-6.9.0-0.rc4.f8dba31b0a82.38.test.eln136.x86_64
BUG: using smp_processor_id() in preemptible [00000000] code: ethtool/78718
caller is cxgb4_selftest_lb_pkt+0x3d/0x3a0 cxgb4
Hardware name: Dell Inc. PowerEdge R750/0WT8Y6, BIOS 1.5.4 12/17/2021
Call Trace:
<TASK>
dump_stack_lvl (lib/dump_stack.c:116)
check_preemption_disabled (lib/smp_processor_id.c:49)
cxgb4_selftest_lb_pkt+0x3d/0x3a0 cxgb4
cxgb4_self_test+0x8f/0xe0 cxgb4
ethtool_self_test (net/ethtool/ioctl.c:2002)
__dev_ethtool (net/ethtool/ioctl.c:2997)
? migrate_enable (./include/linux/preempt.h:480 (discriminator 3) ./include/linux/preempt.h:480 (discriminator 3) kernel/sched/core.c:2472 (discriminator 3))
? kmalloc_trace (./arch/x86/include/asm/jump_label.h:55 ./include/linux/memcontrol.h:1839 mm/slub.c:1980 mm/slub.c:3807 mm/slub.c:3845 mm/slub.c:3992)
dev_ethtool (net/ethtool/ioctl.c:3177)
dev_ioctl (net/core/dev_ioctl.c:724)
sock_do_ioctl (net/socket.c:1236)
__x64_sys_ioctl (fs/ioctl.c:51 fs/ioctl.c:904 fs/ioctl.c:890 fs/ioctl.c:890)
do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1))
? mod_objcg_state (mm/memcontrol.c:3421)
? migrate_enable (./include/linux/preempt.h:480 (discriminator 3) ./include/linux/preempt.h:480 (discriminator 3) kernel/sched/core.c:2472 (discriminator 3))
? try_charge_memcg (mm/memcontrol.c:2745)
? __mod_node_page_state (./include/linux/preempt.h:477 (discriminator 3) mm/vmstat.c:405 (discriminator 3))
? migrate_enable (./include/linux/preempt.h:480 (discriminator 3) ./include/linux/preempt.h:480 (discriminator 3) kernel/sched/core.c:2472 (discriminator 3))
? rt_spin_unlock (kernel/locking/rtmutex.c:230 (discriminator 5) kernel/locking/spinlock_rt.c:84 (discriminator 5))
? do_anonymous_page (./include/linux/pgtable.h:114 mm/memory.c:4490)
? __handle_mm_fault (mm/memory.c:3878 mm/memory.c:5300 mm/memory.c:5441)
? syscall_exit_to_user_mode (kernel/entry/common.c:221)
? __count_memcg_events (./include/linux/preempt.h:477 (discriminator 3) mm/memcontrol.c:704 (discriminator 3) mm/memcontrol.c:963 (discriminator 3))
? handle_mm_fault (mm/memory.c:5483 mm/memory.c:5622)
? do_user_addr_fault (arch/x86/mm/fault.c:1443 (discriminator 1))
? clear_bhb_loop (arch/x86/entry/entry_64.S:1539)
? clear_bhb_loop (arch/x86/entry/entry_64.S:1539)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
RIP: 0033:0x7f55216c557b
Code: ff ff ff 85 c0 79 9b 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 75 68 0f 00 f7 d8 64 89 01 48
All code
========
0: ff (bad)
1: ff (bad)
2: ff 85 c0 79 9b 49 incl 0x499b79c0(%rbp)
8: c7 c4 ff ff ff ff mov $0xffffffff,%esp
e: 5b pop %rbx
f: 5d pop %rbp
10: 4c 89 e0 mov %r12,%rax
13: 41 5c pop %r12
15: c3 ret
16: 66 0f 1f 84 00 00 00 nopw 0x0(%rax,%rax,1)
1d: 00 00
1f: f3 0f 1e fa endbr64
23: b8 10 00 00 00 mov $0x10,%eax
28: 0f 05 syscall
2a:* 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax <-- trapping instruction
30: 73 01 jae 0x33
32: c3 ret
33: 48 8b 0d 75 68 0f 00 mov 0xf6875(%rip),%rcx # 0xf68af
3a: f7 d8 neg %eax
3c: 64 89 01 mov %eax,%fs:(%rcx)
3f: 48 rex.W

Code starting with the faulting instruction
===========================================
0: 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax
6: 73 01 jae 0x9
8: c3 ret
9: 48 8b 0d 75 68 0f 00 mov 0xf6875(%rip),%rcx # 0xf6885
10: f7 d8 neg %eax
12: 64 89 01 mov %eax,%fs:(%rcx)
15: 48 rex.W
RSP: 002b:00007ffd867a78f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007ffd867a7980 RCX: 00007f55216c557b
RDX: 00007ffd867a7990 RSI: 0000000000008946 RDI: 0000000000000003
RBP: 0000556fe43632e0 R08: 0000000000000003 R09: 0000000000000001
R10: 0000000000000fff R11: 0000000000000246 R12: 0000556fe43632a0
R13: 0000000000000018 R14: 0000000000000001 R15: 0000000000000000
</TASK>

--
Sincerely,
John Wyatt
Software Engineer, Core Kernel
Red Hat