Re: [PATCH 2/3] kernel/workqueue: Surround work execution with shared lock annotations
From: Johannes Berg
Date: Thu Oct 25 2018 - 15:17:56 EST
On Thu, 2018-10-25 at 10:22 -0700, Bart Van Assche wrote:
> > What's this intended to change? I currently don't see how lockdep's
> > behaviour would differ with read==1, unless you actually tried to do
> > recursive locking, which isn't really possible?
> >
> > Or perhaps this is actually the right change for the issue described in
> > patch 1, where a work struct flushes another work on the same wq, and
> > that causes recursion of sorts? But that recursion should only happen if
> > the workqueues is actually marked as ordered, in which case it *is* in
> > fact wrong?
>
> How about modifying the wq->lockdep_map annotations only and not touching the
> work->lockdep_map annotations? My comment about concurrency in the patch
> description refers to a multithreaded workqueue executing multiple different
> work items concurrently. I am aware that great care has been taken in the
> workqueue implementation to ensure that each work item is executed by exactly
> one worker.
Again, I'm not sure what you'd want to achieve by that?
If you look at lockdep, read==1 really doesn't do all that much. Note
read is the 4th arg to lock_acquire() via lock_acquire_shared().
Hold time statistics are accounted differently, but that's meaningless
here.
check_deadlock() will behave differently - if read==2 (recursive read)
it will not complain about recursing into a previous read==1 acquire.
Not relevant here, because you can't recurse into the workqueue while
you're running on it.
Some in-IRQ things are different, but we can't be in-IRQ here.
Note that lockdep doesn't care about multiple threads "holding" the same
lock. Everything it does in terms of this tracking is per-thread, afaict
(uses "current" everywhere, and that's how the reports look like etc.).
So ultimately the only places where this read==1 would matter cannot be
hit (recursion within the same thread). Importantly, it doesn't try to
track across the whole system if you're holding the same lock twice.
That's actually impossible with most locks anyway (one would wait), but
when it does happen like it can here, it doesn't really do anything bad.
I don't think it actually *breaks* anything to make this a read/write
lock, because even with a read/write lock you still get a deadlock if
you do
read(A)
lock(B)
write(A)
lock(B)
or
write(A)
lock(B)
read(A)
lock(B)
since read/write are mutually exclusive, and you're not changing the
flush-side to also be a readlock. If you were, I'd argue that'd be wrong
since then you *don't* detect such deadlocks:
read(A)
lock(B)
read(A)
lock(B)
would not result in a complaint (I think), but in the workqueue case it
*should* because you can't flush the workqueue while holding a lock that
you also hold in a work that's executing on it.
So in the end, I just simply don't see a point in this, and I don't
think you've sufficiently explained why this change should be made.
johannes