Re: net: cxgb4: Call Trace reported with PREEMPT_RT: BUG: using smp_processor_id() in preemptible [00000000] code: ethtool/78718

From: Luis Claudio R. Goncalves
Date: Tue Apr 23 2024 - 11:03:18 EST


On Tue, Apr 23, 2024 at 12:10:10AM -0400, John B. Wyatt IV wrote:
> Hello Raju, Hello Sebastian,
>
> Red Hat QE found this issue with cxgb4 only when the kernel has PREEMPT_RT set
> with the preempt-rt patchset:
>
> git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git
>
> We are also seeing this in the Real-time builds of RHEL9 and 8.
>
> The specific build is an internal build that was pulled from the mirror Clark
> Williams set up for Fedora and RHEL testing.
>
> https://gitlab.com/cki-project/kernel-ark/-/tree/os-build-rt?ref_type=heads
>
> We use the branch: os-build-rt
>
> I was unable to find the cause of this and I thought I should report it.
>
> Please let me know if you have any questions or you need any testing done.
>
> Call trace is below:
>
> kernel-rt-6.9.0-0.rc4.f8dba31b0a82.38.test.eln136.x86_64
> BUG: using smp_processor_id() in preemptible [00000000] code: ethtool/78718
> caller is cxgb4_selftest_lb_pkt+0x3d/0x3a0 [cxgb4]
> Hardware name: Dell Inc. PowerEdge R750/0WT8Y6, BIOS 1.5.4 12/17/2021
> Call Trace:
> <TASK>
> dump_stack_lvl (lib/dump_stack.c:116)
> check_preemption_disabled (lib/smp_processor_id.c:49)
> cxgb4_selftest_lb_pkt+0x3d/0x3a0 [cxgb4]
> cxgb4_self_test+0x8f/0xe0 [cxgb4]
> ethtool_self_test (net/ethtool/ioctl.c:2002)
> __dev_ethtool (net/ethtool/ioctl.c:2997)

Hi John,

The patch below is untested but should fix the problem you reported:

======

cxgb4: fix smp_processor_id() usage in selftests

When PREEMPT_RT is enabled, the following call chain can result in a
"BUG: using smp_processor_id() in preemptible [00000000] code:
ethtool/xxxx" error message:

    ethtool_self_test()
        cxgb4_self_test()
            cxgb4_selftest_lb_pkt()
                __netif_tx_lock(q->txq, smp_processor_id()); <--- BOOM

Replacing smp_processor_id() with raw_smp_processor_id() is safe in this
context because __netif_tx_lock() is an inline function that takes a
spinlock and only then uses the cpu value.

Signed-off-by: Luis Claudio R. Goncalves <lgoncalv@xxxxxxxxxx>

diff --git a/drivers/net/ethernet/chelsio/cxgb4/sge.c b/drivers/net/ethernet/chelsio/cxgb4/sge.c
index 1948b7bf99661..803dc62a4db04 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/sge.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/sge.c
@@ -2674,7 +2674,7 @@ int cxgb4_selftest_lb_pkt(struct net_device *netdev)
 	lb->loopback = 1;

 	q = &adap->sge.ethtxq[pi->first_qset];
-	__netif_tx_lock(q->txq, smp_processor_id());
+	__netif_tx_lock(q->txq, raw_smp_processor_id());

 	reclaim_completed_tx(adap, &q->q, -1, true);
 	credits = txq_avail(&q->q) - ndesc;
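
As a rough illustration of the reasoning in the commit message, here is a
minimal userspace sketch in C. It assumes, as described above, that the
helper takes the lock first and only afterwards uses the cpu argument; the
sketch models that use as simply recording the value as the lock owner,
which is an assumption about the helper and not something shown in this
thread. The names model_txq, model_txq_init(), model_tx_lock() and
model_tx_unlock() are hypothetical stand-ins, not kernel APIs, and the C11
atomic_flag loop only imitates a spinlock.

```c
#include <stdatomic.h>

/* Hypothetical userspace stand-in for a TX queue; not a kernel structure. */
struct model_txq {
	atomic_flag xmit_lock;	/* imitates the queue's spinlock */
	int xmit_lock_owner;	/* CPU id recorded by the lock holder, -1 if unlocked */
};

static void model_txq_init(struct model_txq *txq)
{
	atomic_flag_clear(&txq->xmit_lock);
	txq->xmit_lock_owner = -1;
}

/*
 * Take the lock first, then record the caller-supplied CPU id.
 * The id is only stored; it is never used to index per-CPU data here.
 */
static void model_tx_lock(struct model_txq *txq, int cpu)
{
	while (atomic_flag_test_and_set_explicit(&txq->xmit_lock,
						 memory_order_acquire))
		;	/* spin until the lock is free */
	txq->xmit_lock_owner = cpu;
}

/* Clear the recorded owner, then release the lock. */
static void model_tx_unlock(struct model_txq *txq)
{
	txq->xmit_lock_owner = -1;
	atomic_flag_clear_explicit(&txq->xmit_lock, memory_order_release);
}
```

In this model, calling model_tx_lock(&txq, 3) leaves txq.xmit_lock_owner at
3 until model_tx_unlock(&txq) resets it to -1. The cpu value is written
only after the lock is held and is not used for anything else, which is the
property the commit message relies on.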