Re: [PATCH 3/3] kernel/workqueue: Suppress a false positive lockdep complaint

From: Bart Van Assche
Date: Thu Oct 25 2018 - 11:55:54 EST


On Thu, 2018-10-25 at 17:34 +-0200, Johannes Berg wrote:
+AD4 On Thu, 2018-10-25 at 15:05 +-0000, Bart Van Assche wrote:
+AD4
+AD4 +AD4 +AEAAQA -2889,7 +-2893,7 +AEAAQA static bool start+AF8-flush+AF8-work(struct work+AF8-struct +ACo-work, struct wq+AF8-barrier +ACo-barr,
+AD4 +AD4 +ACo workqueues the deadlock happens when the rescuer stalls, blocking
+AD4 +AD4 +ACo forward progress.
+AD4 +AD4 +ACo-/
+AD4 +AD4 - if (+ACE-from+AF8-cancel +ACYAJg
+AD4 +AD4 +- if (+ACE-from+AF8-cancel +ACYAJg (pwq-+AD4-wq-+AD4-flags +ACY +AF8AXw-WQ+AF8-HAS+AF8-BEEN+AF8-USED) +ACYAJg
+AD4 +AD4 (pwq-+AD4-wq-+AD4-saved+AF8-max+AF8-active +AD0APQ 1 +AHwAfA pwq-+AD4-wq-+AD4-rescuer)) +AHs
+AD4 +AD4 lock+AF8-acquire+AF8-exclusive(+ACY-pwq-+AD4-wq-+AD4-lockdep+AF8-map, 0, 0, NULL,
+AD4 +AD4 +AF8-THIS+AF8-IP+AF8)+ADs
+AD4
+AD4 This also doesn't seem right to me. You shouldn't really care whether or
+AD4 not the workqueue has been used at this point, lockdep also doesn't do
+AD4 this for locks.
+AD4
+AD4 Any dependency you cause at some point can - at a future time - be taken
+AD4 into account when checking dependency cycles. Removing one arbitrarily
+AD4 just because you haven't actually executed anything +ACo-yet+ACo just removes
+AD4 knowledge from lockdep. In the general case, this isn't right. Just
+AD4 because you haven't executd anything here doesn't mean that it's
+AD4 +ACo-impossible+ACo to have executed something, right?

Please have a look at the call trace in the description of this patch and also
at the direct I/O code. The lockdep complaint in the description of this patch
really is a false positive. What I think needs further discussion is on how to
address this false positive - track whether or not a work queue has been used
or follow Tejun's proposal that I became aware of after I posted this patch,
namely introduce a new function for destroying a work queue that skips draining,
e.g. destroy+AF8-workqueue+AF8-skip+AF8-drain() (https://lkml.org/lkml/2018/10/24/2).

Thanks,

Bart.