Re: [PATCH V2] lockdep: fix deadlock issue between lockdep and rcu

From: Bart Van Assche
Date: Thu Feb 01 2024 - 12:23:20 EST

In reply to: Carlos Llamas: "Re: [PATCH V2] lockdep: fix deadlock issue between lockdep and rcu"
Next in thread: Boqun Feng: "Re: [PATCH V2] lockdep: fix deadlock issue between lockdep and rcu"

On 1/16/24 23:48, Zhiguo Niu wrote:

> There is a deadlock scenario between lockdep and rcu when
> rcu nocb feature is enabled, just as following call stack:

Is it necessary to support lockdep for this kernel configuration or should we
rather forbid this combination by changing lib/Kconfig.debug?

>  /*
> - * Schedule an RCU callback if no RCU callback is pending. Must be called with
> - * the graph lock held.
> - */
> -static void call_rcu_zapped(struct pending_free *pf)
> +* See if we need to queue an RCU callback, must called with
> +* the lockdep lock held, returns false if either we don't have
> +* any pending free or the callback is already scheduled.
> +* Otherwise, a call_rcu() must follow this function call.
> +*/

Why has the name been changed from "graph lock" to "lockdep lock"? I think
it is called the "graph lock" everywhere else in this source file.

>  	/*
> -	 * If there's anything on the open list, close and start a new callback.
> -	 */
> -	call_rcu_zapped(delayed_free.pf + delayed_free.index);
> +	 * If there's anything on the open list, close and start a new callback.
> +	 */
> +	if (need_callback)
> +		call_rcu(&delayed_free.rcu_head, free_zapped_rcu);

The comment above the if-statement refers to the call_rcu_zapped() function
while call_rcu_zapped() has been changed into call_rcu(). So the comment is
now incorrect.

Additionally, what guarantees that the above code won't be triggered
concurrently from two different threads? As you may know, calling call_rcu()
a second time on the same rcu_head before its callback has been invoked is
not allowed. I think that can happen with the above code.
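
To make the concern concrete, here is a minimal userspace sketch (not kernel
code) of one common way to provide such a guarantee: a "scheduled" flag that
is tested and set under the lock, so that only one caller is ever told to
queue the callback. All names below (delayed_free_sketch, prepare_callback(),
maybe_queue_callback(), callback_done()) are hypothetical, a pthread mutex
stands in for the graph lock, and a counter stands in for call_rcu(). Whether
the patch relies on an equivalent flag is exactly what should be confirmed.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the delayed-free bookkeeping; not the kernel struct. */
struct delayed_free_sketch {
	pthread_mutex_t lock;     /* plays the role of the graph lock */
	bool scheduled;           /* true while a callback is queued but has not run */
	int pending;              /* number of items waiting to be freed */
	atomic_int queued_count;  /* counts stand-in call_rcu() invocations */
};

/* Must be called with df->lock held. Returns true for at most one caller
 * between two callback runs, because the flag is tested and set atomically
 * with respect to the lock. */
static bool prepare_callback(struct delayed_free_sketch *df)
{
	if (df->pending == 0 || df->scheduled)
		return false;
	df->scheduled = true;
	return true;
}

/* Decide under the lock, queue after dropping it. */
static void maybe_queue_callback(struct delayed_free_sketch *df)
{
	bool need_callback;

	pthread_mutex_lock(&df->lock);
	need_callback = prepare_callback(df);
	pthread_mutex_unlock(&df->lock);

	if (need_callback)
		atomic_fetch_add(&df->queued_count, 1); /* stand-in for call_rcu() */
}

/* What the callback would do once it runs: release the pending items and
 * clear the flag so that a later caller may queue the callback again. */
static void callback_done(struct delayed_free_sketch *df)
{
	pthread_mutex_lock(&df->lock);
	df->pending = 0;
	df->scheduled = false;
	pthread_mutex_unlock(&df->lock);
}
```

With this pattern, two back-to-back calls to maybe_queue_callback() queue the
callback only once; a second queueing becomes possible only after
callback_done() has cleared the flag. Without the flag, or if the flag were
tested outside the lock, both callers could reach the call_rcu() stand-in.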

Bart.
