Re: possible deadlock in lru_add_drain_all

From: Peter Zijlstra
Date: Tue Oct 31 2017 - 09:51:27 EST


On Tue, Oct 31, 2017 at 02:13:33PM +0100, Michal Hocko wrote:
> On Mon 30-10-17 16:10:09, Peter Zijlstra wrote:

> > However, that splat translates like:
> >
> > __cpuhp_setup_state()
> > #0 cpus_read_lock()
> > __cpuhp_setup_state_cpuslocked()
> > #1 mutex_lock(&cpuhp_state_mutex)
> >
> >
> >
> > __cpuhp_state_add_instance()
> > #2 mutex_lock(&cpuhp_state_mutex)
>
> this should be #1 right?

Yes, that should be #1.

> > cpuhp_issue_call()
> > cpuhp_invoke_ap_callback()
> > #3 wait_for_completion()
> >
> > msr_device_create()
> > ...
> > #4 filename_create()
> > #3 complete()
> >
> >
> >
> > do_splice()
> > #4 file_start_write()
> > do_splice_from()
> > iter_file_splice_write()
> > #5 pipe_lock()
> > vfs_iter_write()
> > ...
> > #6 inode_lock()
> >
> >
> >
> > sys_fcntl()
> > do_fcntl()
> > shmem_fcntl()
> > #5 inode_lock()

And that should be #6.

> > shmem_wait_for_pins()
> > if (!scan)
> > lru_add_drain_all()
> > #0 cpus_read_lock()
> >
> >
> >
> > Which is an actual real deadlock, there is no mixing of up and down.
>
> thanks a lot, this made it more clear to me. It took a while to
> actually see 0 -> 1 -> 3 -> 4 -> 5 -> 0 cycle. I have only focused
> on lru_add_drain_all while it was holding the cpus lock.

Yeah, these things are a pain to read, which is why I always construct
something like the above first.
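
The table above can also be checked mechanically. What follows is only a rough sketch, not lockdep and not anything from the kernel tree: the edge list is transcribed by hand from the trace above using the corrected labels (the second cpuhp_state_mutex read as #1, the inode_lock under shmem_fcntl() read as #6), and the helper name find_cycle is made up for illustration. An edge A -> B means B is acquired, or waited for, while A is held.

```python
# Dependency edges transcribed by hand from the trace above.
# "A -> B" means B is acquired (or waited for) while A is held.
EDGES = {
    0: [1],  # cpus_read_lock() -> mutex_lock(&cpuhp_state_mutex)
    1: [3],  # cpuhp_state_mutex -> wait_for_completion()
    3: [4],  # complete() only happens after filename_create()
    4: [5],  # file_start_write() -> pipe_lock()
    5: [6],  # pipe_lock() -> inode_lock()
    6: [0],  # inode_lock() -> cpus_read_lock() via lru_add_drain_all()
}


def find_cycle(edges):
    """Return one cycle as a list of nodes (first == last), or None."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    colour = {}
    path = []

    def visit(node):
        colour[node] = GREY
        path.append(node)
        for nxt in edges.get(node, []):
            state = colour.get(nxt, WHITE)
            if state == GREY:
                # Back edge to a node on the current path: that is a cycle.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        colour[node] = BLACK
        return None

    for start in sorted(edges):
        if colour.get(start, WHITE) == WHITE:
            found = visit(start)
            if found:
                return found
    return None


cycle = find_cycle(EDGES)
print(" -> ".join("#%d" % n for n in cycle))
```

With the corrected labels the walk goes through both #5 and #6 before closing back on #0. Removing any single edge from EDGES makes find_cycle() return None, which is the same as saying that breaking any one of these dependencies breaks the deadlock.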