Re: possible deadlock in lru_add_drain_all

From: Dmitry Vyukov
Date: Tue Oct 31 2017 - 09:56:07 EST


On Tue, Oct 31, 2017 at 4:51 PM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Tue, Oct 31, 2017 at 02:13:33PM +0100, Michal Hocko wrote:
>> On Mon 30-10-17 16:10:09, Peter Zijlstra wrote:
>
>> > However, that splat translates like:
>> >
>> > __cpuhp_setup_state()
>> > #0    cpus_read_lock()
>> >         __cpuhp_setup_state_cpuslocked()
>> > #1        mutex_lock(&cpuhp_state_mutex)
>> >
>> >
>> >
>> > __cpuhp_state_add_instance()
>> > #2    mutex_lock(&cpuhp_state_mutex)
>>
>> this should be #1 right?
>
> Yes
>
>> >         cpuhp_issue_call()
>> >           cpuhp_invoke_ap_callback()
>> > #3          wait_for_completion()
>> >
>> >                     msr_device_create()
>> >                       ...
>> > #4                      filename_create()
>> > #3          complete()
>> >
>> >
>> >
>> > do_splice()
>> > #4    file_start_write()
>> >         do_splice_from()
>> >           iter_file_splice_write()
>> > #5          pipe_lock()
>> >               vfs_iter_write()
>> >                 ...
>> > #6                inode_lock()
>> >
>> >
>> >
>> > sys_fcntl()
>> >   do_fcntl()
>> >     shmem_fcntl()
>> > #5      inode_lock()
>
> And that #6
>
>> >       shmem_wait_for_pins()
>> >         if (!scan)
>> >           lru_add_drain_all()
>> > #0          cpus_read_lock()
>> >
>> >
>> >
>> > Which is an actual real deadlock; there is no mixing of up and down.
>>
>> thanks a lot, this made it much clearer to me. It took a while to
>> actually see the 0 -> 1 -> 3 -> 4 -> 5 -> 0 cycle. I had only focused
>> on lru_add_drain_all while it was holding the cpus lock.
>
> Yeah, these things are a pain to read, which is why I always construct
> something like the above first.
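
To make the shape of that cycle concrete, below is a minimal user-space
analogue I sketched (not kernel code: three pthread mutexes stand in for
cpu_hotplug_lock, cpuhp_state_mutex and the shmem inode lock, and the
completion plus the splice locks in between are collapsed away). If each
thread grabs its first lock before any of them reaches its second, the
three of them deadlock just like the chain above.

/* build with: gcc -pthread cycle.c   (file name is just an example) */
#include <pthread.h>
#include <unistd.h>

static pthread_mutex_t L0 = PTHREAD_MUTEX_INITIALIZER; /* ~ cpu_hotplug_lock  */
static pthread_mutex_t L1 = PTHREAD_MUTEX_INITIALIZER; /* ~ cpuhp_state_mutex */
static pthread_mutex_t L2 = PTHREAD_MUTEX_INITIALIZER; /* ~ shmem inode lock  */

static void *setup_state(void *arg)      /* ~ __cpuhp_setup_state() */
{
	pthread_mutex_lock(&L0);         /* #0 */
	sleep(1);                        /* widen the race window */
	pthread_mutex_lock(&L1);         /* #1 */
	pthread_mutex_unlock(&L1);
	pthread_mutex_unlock(&L0);
	return NULL;
}

static void *add_instance(void *arg)     /* ~ __cpuhp_state_add_instance() */
{
	pthread_mutex_lock(&L1);         /* #1 */
	sleep(1);
	pthread_mutex_lock(&L2);         /* stands in for the #3..#6 chain */
	pthread_mutex_unlock(&L2);
	pthread_mutex_unlock(&L1);
	return NULL;
}

static void *shmem_fcntl_path(void *arg) /* ~ shmem_fcntl() */
{
	pthread_mutex_lock(&L2);         /* ~ inode_lock() */
	sleep(1);
	pthread_mutex_lock(&L0);         /* ~ lru_add_drain_all() -> cpus_read_lock() */
	pthread_mutex_unlock(&L0);
	pthread_mutex_unlock(&L2);
	return NULL;
}

int main(void)
{
	pthread_t t[3];

	pthread_create(&t[0], NULL, setup_state, NULL);
	pthread_create(&t[1], NULL, add_instance, NULL);
	pthread_create(&t[2], NULL, shmem_fcntl_path, NULL);
	for (int i = 0; i < 3; i++)
		pthread_join(t[i], NULL); /* never returns: three-way deadlock */
	return 0;
}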


I noticed that for a simple 2-lock deadlock lockdep prints only 2
stacks. FWIW, in user-space TSAN we print 4 stacks for such deadlocks:
where A was locked, where B was locked under A, where B was locked,
and where A was locked under B. That makes it easier to figure out
what is going on. However, for this report that scheme would need 8
stacks, so it's probably hard to read either way.
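
For reference, the 2-lock case boils down to something like this (my
own user-space sketch; built with clang -fsanitize=thread, and with
TSAN_OPTIONS=detect_deadlocks=1 on runtimes where deadlock detection
is not already on by default, TSAN reports the lock-order-inversion
with those four acquisition stacks even though this particular run
never actually hangs):

#include <pthread.h>

static pthread_mutex_t A = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t B = PTHREAD_MUTEX_INITIALIZER;

static void *thr1(void *arg)
{
	pthread_mutex_lock(&A);    /* stack 1: where A was locked         */
	pthread_mutex_lock(&B);    /* stack 2: where B was locked under A */
	pthread_mutex_unlock(&B);
	pthread_mutex_unlock(&A);
	return NULL;
}

static void *thr2(void *arg)
{
	pthread_mutex_lock(&B);    /* stack 3: where B was locked         */
	pthread_mutex_lock(&A);    /* stack 4: where A was locked under B */
	pthread_mutex_unlock(&A);
	pthread_mutex_unlock(&B);
	return NULL;
}

int main(void)
{
	pthread_t t1, t2;

	/* Run the two lock orders one after the other so the program
	 * itself never hangs; the inversion is still recorded. */
	pthread_create(&t1, NULL, thr1, NULL);
	pthread_join(t1, NULL);
	pthread_create(&t2, NULL, thr2, NULL);
	pthread_join(t2, NULL);
	return 0;
}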