Re: [PATCH v2] mm/page_isolation: fix a deadlock with printk()
From: Petr Mladek
Date: Mon Oct 07 2019 - 05:05:58 EST
On Mon 2019-10-07 10:07:42, Michal Hocko wrote:
> On Fri 04-10-19 18:26:45, Qian Cai wrote:
> > It is unsafe to call printk() while zone->lock was held, i.e.,
> >
> > zone->lock --> console_lock
> >
> > because the console could always allocate some memory in different code
> > paths and form locking chains in an opposite order,
> >
> > console_lock --> * --> zone->lock
> >
> > As the result, it triggers lockdep splats like below and in different
> > code paths in this thread [1]. Since has_unmovable_pages() was only used
> > in set_migratetype_isolate() and is_pageblock_removable_nolock(). Only
> > the former will set the REPORT_FAILURE flag which will call printk().
> > Hence, unlock the zone->lock just before the dump_page() there where
> > when has_unmovable_pages() returns true, there is no need to hold the
> > lock anyway in the rest of set_migratetype_isolate().
> >
> > While at it, remove a problematic printk() in __offline_isolated_pages()
> > only for debugging as well which will always disable lockdep on debug
> > kernels.
>
> I do not think that removing the printk is the right long term solution.
> While I do agree that removing the debugging printk __offline_isolated_pages
> does make sense because it is essentially of a very limited use, this
> doesn't really solve the underlying problem. There are likely other
> printks from zone->lock. It would be much more saner to actually
> disallow consoles to allocate any memory while printk is called from an
> atomic context.
The current "standard" solution for these situations is to replace
the problematic printk() with printk_deferred(). It would deffer
the console handling.
Of course, this is a whack a mole approach. The long term solution
is to deffer printk() by default. We have finally agreed on this
few weeks ago on Plumbers conference. It is going to be added
together with fully lockless log buffer hopefully soon. It will
be part of upstreaming Real-Time related code.
> > The problem is probably there forever, but neither many developers will
> > run memory offline with the lockdep enabled nor admins in the field are
> > lucky enough yet to hit a perfect timing which required to trigger a
> > real deadlock. In addition, there aren't many places that call printk()
> > while zone->lock was held.
> >
> > WARNING: possible circular locking dependency detected
> > ------------------------------------------------------
> > test.sh/1724 is trying to acquire lock:
> > 0000000052059ec0 (console_owner){-...}, at: console_unlock+0x
> > 01: 328/0xa30
> >
> > but task is already holding lock:
> > 000000006ffd89c8 (&(&zone->lock)->rlock){-.-.}, at: start_iso
> > 01: late_page_range+0x216/0x538
>
> I am also wondering what does this lockdep report actually say. How come
> we have a dependency between a start_kernel path and a syscall?
My understanding is that these are different code paths. Where each
code paths shows one existing lock ordering.
IMHO, it is possible that these code paths could never run in
parallel. I guess that lockdep is not able to distinguish
code paths that are called only during boot and others
that are called only in fully booted system. That said,
I am not sure if this is the case here.
Best Regards,
Petr