There are some code paths in the kernel where arch_spin_lock() is
called directly when the lock isn't expected to be contended and the
critical section is short. For example,
tracing_saved_cmdlines_size_read() in kernel/trace/trace.c does that.

In most of these cases, preemption is not disabled either. This
creates a problem for the qspinlock slowpath, which expects preemption
to be disabled to guarantee safe use of the per-CPU qnodes structure.
To work around these special use cases, add a preemption count check
in the slowpath and do a simple spin-wait when preemption isn't
disabled.
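
The check described above can be modelled in userspace roughly as
follows. This is only a sketch of the idea, not the code in this patch:
every model_* name is a hypothetical stand-in, model_preempt_count takes
the place of the kernel's preemption count, and the "queued" path is
reduced to a plain spin because the per-CPU qnodes are not modelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these are kernel definitions. */
struct model_lock {
	atomic_int locked;		/* 0 = free, 1 = held */
};

static int model_preempt_count;	/* 0 means "preemptible" in this model */
static int model_simple_spins;	/* times the simple spin-wait was used */
static int model_queued_spins;	/* times the queued path was used */

/* Try to take the lock once; returns true on success. */
static bool model_trylock(struct model_lock *lock)
{
	int expected = 0;

	return atomic_compare_exchange_strong(&lock->locked, &expected, 1);
}

/* Placeholder for the queued slowpath, which would use per-CPU qnodes. */
static void model_queued_slowpath(struct model_lock *lock)
{
	model_queued_spins++;
	while (!model_trylock(lock))
		;	/* spin */
}

/* Slowpath entry with the preemption count check in front. */
static void model_lock_slowpath(struct model_lock *lock)
{
	if (model_preempt_count == 0) {
		/* Preemptible caller: avoid the per-CPU state, just spin. */
		model_simple_spins++;
		while (!model_trylock(lock))
			;	/* spin */
		return;
	}
	model_queued_slowpath(lock);
}

/* Release the lock; the caller must currently hold it. */
static void model_unlock(struct model_lock *lock)
{
	assert(atomic_load(&lock->locked) == 1);
	atomic_store(&lock->locked, 0);
}
```

In this model a caller with a zero preemption count never reaches
model_queued_slowpath(), which is the property the check is meant to
provide.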

Fixes: a33fda35e3a7 ("Introduce a simple generic 4-byte queued spinlock")
Signed-off-by: Waiman Long <longman@xxxxxxxxxx>