Re: [REGRESSION] Boot hang with 939f04bec printk: enable interrupts before calling console_trylock_for_printk()

From: Jan Kara
Date: Mon Jul 21 2014 - 06:04:42 EST


On Sat 19-07-14 00:50:05, Andreas Bombe wrote:
> On Thu, Jul 17, 2014 at 10:31:37AM +0200, Jan Kara wrote:
> > On Wed 16-07-14 23:34:08, Andreas Bombe wrote:
> > > On Mon, Jul 14, 2014 at 10:35:27AM +0200, Jan Kara wrote:
> > > > On Sun 29-06-14 00:50:50, Andreas Bombe wrote:
> > > > > None of the post-3.15 kernels boot for me. They all hang at the GRUB
> > > > > screen telling me it loaded and started the kernel, but the kernel
> > > > > itself stops before it prints anything (or even replaces the GRUB
> > > > > background graphics).
> > > > >
> > > > > I bisected it down to 939f04bec1a4ef6ba4370b0f34b01decc844b1b1 "printk:
> > > > > enable interrupts before calling console_trylock_for_printk()".
> > > > > Reverting that patch on the latest kernel (git 24b414d5a7) allows me to
> > > > > boot normally. I fixed the conflict in the revert by leaving in the "if
> > > > > (in_sched) return printed_len;".
> > > > >
> > > > > I have the "early printk via the EFI framebuffer" option enabled,
> > > > > disabling it made no difference however.
> > > > Thanks for the report. I've been on vacation so I'm replying with a delay. I
> > > > believe this is one of the issues where this patch just uncovers an underlying
> > > > problem - I believe lockdep tries to report some locking issue in console
> > > > driver code (this patch increased lockdep coverage of console driver code)
> > > > however we are holding some locks in printk code which make lockdep
> > > > deadlock. Can you try running with the attached patch?
> > >
> > > EUNABLE
> > >
> > > You forgot to attach a patch.
> > Bah, sorry. Attaching now.
>
> I don't see anything in /sys/kernel/debug/tracing/trace_pipe or
> .../trace (besides the header) with your patch applied. In case you
> meant to test it with the problematic printk change, I also tried with
> the revert reverted. That still hangs as before without any error report
> to see.
Yes, I meant testing my lockdep patch with the problematic printk change.
Thanks for having a look. I'm puzzled why it didn't help.

> I checked the kernel logs and there is also no lockdep report anywhere.
> I get the "trace_printk() being used" notice but nothing else of
> interest around there. Though the notice should mean trace_printk() was
> used at least once?
Yes. Anyway, I'd be grateful if you could run one more test for me so
that I can better understand what's going on. Can you take a recent vanilla
kernel (with the revert) and apply the attached patch to it? It again enables
interrupts when calling console_unlock() but keeps lockdep coverage
unchanged. It helped Sasha so I want to see whether your case is similar or
different. Thanks!

Honza
--
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR