Re: possible deadlock in lru_add_drain_all
From: Byungchul Park
Date: Mon Oct 30 2017 - 06:09:32 EST
On Mon, Oct 30, 2017 at 09:22:03AM +0100, Michal Hocko wrote:
> [Cc Byungchul. The original full report is
> http://lkml.kernel.org/r/089e0825eec8955c1f055c83d476@xxxxxxxxxx]
>
> Could you have a look please? This smells like a false positive to me.
+cc peterz@xxxxxxxxxxxxx
Hello,
IMHO, the false positive was caused by the lockdep_map of 'cpuhp_state',
which could not distinguish between cpu-up and cpu-down, so dependencies
recorded on both paths were attributed to the same lock class.
That was fixed by the following commit from Peter and Thomas:
5f4b55e10645b7371322c800a5ec745cab487a6c
smp/hotplug: Differentiate the AP-work lockdep class between up and down
Therefore, kernels that include that commit should no longer report this
false positive.
Peter and Thomas, could you confirm it?
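To illustrate the reasoning above, here is a minimal sketch in Python of how a single lock class for both hotplug directions can produce a cycle that disappears once the class is split. It is only a toy model, not lockdep's implementation: 'lock_a' and the two dependency edges are hypothetical, and the class names are just labels chosen for the example.

```python
from collections import defaultdict


def has_cycle(edges):
    """Return True if the directed lock-order graph contains a cycle."""
    graph = defaultdict(list)
    for held, acquired in edges:
        graph[held].append(acquired)

    WHITE, GREY, BLACK = 0, 1, 2
    colour = defaultdict(int)  # every node starts WHITE (unvisited)

    def visit(node):
        colour[node] = GREY  # node is on the current DFS path
        for nxt in graph[node]:
            if colour[nxt] == GREY:
                return True  # back edge: a lock-order cycle
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[node] = BLACK  # fully explored, no cycle through it
        return False

    return any(colour[n] == WHITE and visit(n) for n in list(graph))


def lock_order_edges(split_up_down):
    """Hypothetical dependencies: (class held, class acquired)."""
    up = "cpuhp_state-up" if split_up_down else "cpuhp_state"
    down = "cpuhp_state-down" if split_up_down else "cpuhp_state"
    return [
        (up, "lock_a"),    # assumed: taken while in the cpu-up path
        ("lock_a", down),  # assumed: held while entering the cpu-down path
    ]


merged = has_cycle(lock_order_edges(split_up_down=False))
split = has_cycle(lock_order_edges(split_up_down=True))
print("single class reports a cycle:", merged)
print("split classes report a cycle:", split)
```

With one shared class the two edges close a loop, although the up and down paths never run against each other in that way. With separate classes the same edges form a simple chain.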
Thanks,
Byungchul
> On Fri 27-10-17 15:42:34, Michal Hocko wrote:
> > On Fri 27-10-17 11:44:58, Dmitry Vyukov wrote:
> > > On Fri, Oct 27, 2017 at 11:34 AM, Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> > > > On Fri 27-10-17 02:22:40, syzbot wrote:
> > > >> Hello,
> > > >>
> > > >> syzkaller hit the following crash on
> > > >> a31cc455c512f3f1dd5f79cac8e29a7c8a617af8
> > > >> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
> > > >> compiler: gcc (GCC) 7.1.1 20170620
> > > >> .config is attached
> > > >> Raw console output is attached.
> > > >
> > > > I do not see such a commit. My linux-next top is next-20171018
> > > >
> > > > [...]
> > > >> Chain exists of:
> > > >> cpu_hotplug_lock.rw_sem --> &pipe->mutex/1 --> &sb->s_type->i_mutex_key#9
> > > >>
> > > >> Possible unsafe locking scenario:
> > > >>
> > > >>        CPU0                    CPU1
> > > >>        ----                    ----
> > > >>   lock(&sb->s_type->i_mutex_key#9);
> > > >>                                lock(&pipe->mutex/1);
> > > >>                                lock(&sb->s_type->i_mutex_key#9);
> > > >>   lock(cpu_hotplug_lock.rw_sem);
> > > >
> > > > I am quite confused about this report. Where exactly is the deadlock?
> > > > I do not see where we would get pipe mutex from inside of the hotplug
> > > > lock. Is it possible this is just a false positive due to the
> > > > cross-release feature?
> > >
> > >
> > > As far as I understand, this CPU0/CPU1 scheme works only for simple
> > > cases with 2 mutexes. This one seems to have a larger cycle, as denoted
> > > by the "the existing dependency chain (in reverse order) is:" section.
> >
> > My point was that lru_add_drain_all doesn't take any external locks
> > other than lru_lock and that one is not anywhere in the chain AFAICS.
> >
> > --
> > Michal Hocko
> > SUSE Labs
>
> --
> Michal Hocko
> SUSE Labs