Issue with rcu_read_lock and CONFIG_PREEMPT_RCU

From: Marciniszyn, Mike
Date: Wed Dec 08 2021 - 06:35:28 EST


As part of testing the 5.16-rc series, we noticed a new BUG message originating from check_preemption_disabled().

We submitted a patch to move a call to smp_processor_id() into an RCU read-side critical section within the same function.

See https://lore.kernel.org/linux-rdma/20211129191958.101968.87329.stgit@xxxxxxxxxxxxxxxxxxxxxxxxxxxx/T/#u.

Much to my surprise, additional testing still sees the BUG!

Further testing shows that an explicit preempt_disable()/preempt_enable() pair placed around the RCU critical section silences the warning.

The RCU config is:

#
# RCU Subsystem
#
CONFIG_TREE_RCU=y
CONFIG_PREEMPT_RCU=y
# CONFIG_RCU_EXPERT is not set
CONFIG_SRCU=y
CONFIG_TREE_SRCU=y
CONFIG_TASKS_RCU_GENERIC=y
CONFIG_TASKS_RCU=y
CONFIG_TASKS_RUDE_RCU=y
CONFIG_TASKS_TRACE_RCU=y
CONFIG_RCU_STALL_COMMON=y
CONFIG_RCU_NEED_SEGCBLIST=y
CONFIG_RCU_NOCB_CPU=y
# end of RCU Subsystem


It looks like there is a difference between the checking in check_preemption_disabled() and the implicit preemption disabling in __rcu_read_lock().

The implicit disable looks like:

static void rcu_preempt_read_enter(void)
{
	WRITE_ONCE(current->rcu_read_lock_nesting, READ_ONCE(current->rcu_read_lock_nesting) + 1);
}

The checking code uses the x86 definition of preempt_count():

static __always_inline int preempt_count(void)
{
	return raw_cpu_read_4(__preempt_count) & ~PREEMPT_NEED_RESCHED;
}

An explicit disable uses this x86 code:

static __always_inline void __preempt_count_add(int val)
{
	raw_cpu_add_4(__preempt_count, val);
}

The difference seems to be the use of __preempt_count vs. rcu_read_lock_nesting.
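
To make the mismatch concrete, here is a rough userspace model, not kernel code. Every model_* name is hypothetical, and plain variables stand in for the per-CPU __preempt_count and the per-task rcu_read_lock_nesting. The check is reduced to the preempt count test alone; the real check_preemption_disabled() has further exits (for example, when interrupts are disabled) that this sketch leaves out.

```c
#include <stdbool.h>

/* Hypothetical stand-ins: plain ints, not real per-CPU or per-task state. */
static int model_preempt_count;         /* models per-CPU __preempt_count */
static int model_rcu_read_lock_nesting; /* models current->rcu_read_lock_nesting */

/* Under CONFIG_PREEMPT_RCU=y, read-side entry and exit only adjust the nesting count. */
static void model_rcu_read_lock(void)
{
	model_rcu_read_lock_nesting++;
}

static void model_rcu_read_unlock(void)
{
	model_rcu_read_lock_nesting--;
}

/* An explicit disable/enable pair adjusts the preempt count instead. */
static void model_preempt_disable(void)
{
	model_preempt_count++;
}

static void model_preempt_enable(void)
{
	model_preempt_count--;
}

/*
 * Simplified check: warn whenever the preempt count is zero.
 * The RCU nesting count is never consulted.
 */
static bool model_check_would_warn(void)
{
	return model_preempt_count == 0;
}
```

In this model, calling model_rcu_read_lock() alone leaves model_check_would_warn() returning true, while wrapping the section in model_preempt_disable()/model_preempt_enable() makes it return false. That matches what we see in testing.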

This can't be good...

Mike