Re: [PATCH v2] mm/page_isolation: fix a deadlock with printk()

From: Michal Hocko
Date: Tue Oct 08 2019 - 14:36:57 EST


On Tue 08-10-19 12:08:37, Qian Cai wrote:
> On Tue, 2019-10-08 at 14:56 +0200, Christian Borntraeger wrote:
> > Adding Peter Oberparleiter.
> > Peter, can you have a look?
> >
> > On 08.10.19 10:27, Michal Hocko wrote:
> > > On Tue 08-10-19 09:43:57, Petr Mladek wrote:
> > > > On Mon 2019-10-07 16:49:37, Michal Hocko wrote:
> > > > > [Cc s390 maintainers - the lockdep is http://lkml.kernel.org/r/1570228005-24979-1-git-send-email-cai@xxxxxx
> > > > > Petr has explained it is a false positive
> > > > > http://lkml.kernel.org/r/20191007143002.l37bt2lzqtnqjqxu@xxxxxxxxxxxxxxx]
> > > > > On Mon 07-10-19 16:30:02, Petr Mladek wrote:
> > > > > [...]
> > > > > > I believe that it cannot really happen because:
> > > > > >
> > > > > > static int __init
> > > > > > sclp_console_init(void)
> > > > > > {
> > > > > > [...]
> > > > > > rc = sclp_rw_init();
> > > > > > [...]
> > > > > > register_console(&sclp_console);
> > > > > > return 0;
> > > > > > }
> > > > > >
> > > > > > sclp_rw_init() is called before register_console(). And
> > > > > > console_unlock() will never call sclp_console_write() before
> > > > > > the console is registered.
> > > > > >
> > > > > > AFAIK, lockdep only compares existing chain of locks. It does
> > > > > > not know about console registration that would make some
> > > > > > code paths mutually exclusive.
> > > > > >
> > > > > > I believe that it is a false positive. I do not know how to
> > > > > > avoid this lockdep report. I hope that it will disappear
> > > > > > by deferring all printk() calls rather soon.
> > > > >
> > > > > Thanks a lot for looking into this Petr. I have also checked the code
> > > > > and I really fail to see why the allocation has to be done under the
> > > > > lock in the first place. sclp_read_sccb and sclp_init_sccb are global
> > > > > variables but I strongly suspect that they need a synchronization during
> > > > > early init, callbacks are registered only later IIUC:
> > > >
> > > > Good idea. It would work when the init function is called only once.
> > > > But see below.
> > > >
> > > > > diff --git a/drivers/s390/char/sclp.c b/drivers/s390/char/sclp.c
> > > > > index d2ab3f07c008..4b1c033e3255 100644
> > > > > --- a/drivers/s390/char/sclp.c
> > > > > +++ b/drivers/s390/char/sclp.c
> > > > > @@ -1169,13 +1169,13 @@ sclp_init(void)
> > > > > unsigned long flags;
> > > > > int rc = 0;
> > > > >
> > > > > + sclp_read_sccb = (void *) __get_free_page(GFP_ATOMIC | GFP_DMA);
> > > > > + sclp_init_sccb = (void *) __get_free_page(GFP_ATOMIC | GFP_DMA);
> > > > > spin_lock_irqsave(&sclp_lock, flags);
> > > > > /* Check for previous or running initialization */
> > > > > if (sclp_init_state != sclp_init_state_uninitialized)
> > > > > goto fail_unlock;
> > > >
> > > > It seems that sclp_init() could be called several times in parallel.
> > > > I see it called from sclp_register() and sclp_initcall().
> > >
> > > Interesting. Something for s390 people to answer I guess.
> > > Anyway, this should be quite trivial to workaround by a cmpxch or alike.
> > >
>
> The above fix is simply insufficient,

Isn't this yet another init time lockdep false possitive?

> 00: [    3.654337] -> #3 (console_owner){....}:                                 
> 00: [    3.654343]        lock_acquire+0x21a/0x468                              
> 00: [    3.654345]        console_unlock+0x3a6/0xa30                            
> 00: [    3.654346]        vprintk_emit+0x184/0x3c8                              
> 00: [    3.654348]        vprintk_default+0x44/0x50                             
> 00: [    3.654349]        printk+0xa8/0xc0                                      
> 00: [    3.654351]        get_random_u64+0x40/0x108                             
> 00: [    3.654360]        add_to_free_area_random+0x188/0x1c0                   
> 00: [    3.654364]        free_one_page+0x72/0x128                              
> 00: [    3.654366]        __free_pages_ok+0x51c/0xca0                           
> 00: [    3.654368]        memblock_free_all+0x30a/0x3b0                         
> 00: [    3.654370]        mem_init+0x84/0x200                                   
> 00: [    3.654371]        start_kernel+0x384/0x6a0                              
> 00: [    3.654373]        startup_continue+0x70/0xd0                            

This one is actually a nice example why trying to get printk out of the
zone->lock is simply not viable. This one is likely a printk to warn
that the random pool is not fully intiailized. Just because the
allocator tries to randomize the initial free memory pool. You are not
going to remove that printk, right?

I fully agree that this class of lockdep splats are annoying especially
when they make the lockdep unusable but please discuss this with lockdep
maintainers and try to find some solution rather than go and try to
workaround the problem all over the place. If there are places that
would result in a cleaner code then go for it but please do not make the
code worse just because of a non existent problem flagged by a false
positive.

--
Michal Hocko
SUSE Labs