Re: [PATCH -next v3] mm/hotplug: silence a lockdep splat with printk()

From: Michal Hocko
Date: Thu Jan 16 2020 - 12:56:21 EST


On Thu 16-01-20 11:05:07, Qian Cai wrote:
>
>
> > On Jan 16, 2020, at 10:54 AM, Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> >
> > On Thu 16-01-20 09:53:13, Qian Cai wrote:
> >>
> >>
> >>> On Jan 16, 2020, at 9:28 AM, Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> >>>
> >>> On Wed 15-01-20 12:29:16, Qian Cai wrote:
> >>>> Calling printk() with zone->lock held is guaranteed to trigger a
> >>>> lockdep splat, because many places (tty, console drivers,
> >>>> debugobjects, etc.) allocate memory with another lock held, and
> >>>> fixing them all has proved difficult.
> >>>
> >>> I am still not very happy with the above. What would you say about
> >>> something like the below instead?
> >>> "
> >>> It is not that hard to trigger lockdep splats by calling printk from
> >>> under zone->lock. Most of them are false positives caused by lock chains
> >>> introduced early in the boot process and they do not cause any real
> >>> problems. There are some console drivers which do allocate from the
> >>> printk context as well and those should be fixed. In any case false
> >>> positives are not that trivial to work around and it is far from optimal
> >>> to lose lockdep functionality for something that is a non-issue.
> >>> <An example of such a false positive goes here>
> >>> "
> >>
> >> I feel like I have repeated myself too many times. A call trace for one lock
> >> dependency is sometimes from the early boot process because lockdep saves the
> >> first one it encounters, but that does not mean the lock dependency can only
> >> happen in early boot. I spent some time studying those early boot call traces
> >> in the given lockdep splats, and it looks to me like the lock dependency is
> >> also possible after boot.
> >
> > Then state it explicitly with an example of the trace and explanation
> > that the deadlock is real. If the deadlock is real then it shouldn't be
> > really terribly hard to notice even without lockdep splats which get
> > disabled after the first false positive, right?
>
> A deadlock could be really hard to trigger, though, since it needs perfect
> timing between multiple threads.

All I am saying is: Do not speculate in changelog. Make clear arguments.
So far we have seen many false positives and that is stated in the
wording I have suggested. It is also explained why those suck. There is
also a note that _some_ consoles might indeed deadlock. Compare that to
the original changelog which doesn't really say anything useful about
those lockdep splats.

I obviously do not insist on my wording but please make the changelog
clear on the actual problem and stick to facts.

Thanks!
--
Michal Hocko
SUSE Labs