Re: frequent lockups in 3.18rc4

From: Dave Young
Date: Fri Nov 21 2014 - 04:46:57 EST

On 11/20/14 at 12:38pm, Dave Jones wrote:
> On Thu, Nov 20, 2014 at 11:48:09AM -0500, Vivek Goyal wrote:
> > Can we try following and retry and see if some additional messages show
> > up on console and help us narrow down the problem.
> >
> > - Enable verbose boot messages. CONFIG_X86_VERBOSE_BOOTUP=y
> >
> > - Enable early printk in second kernel. (earlyprintk=ttyS0,115200).
> >
> > You can either enable early printk in first kernel and reboot. That way
> > second kernel will automatically have it enabled. Or you can edit
> > "/etc/sysconfig/kdump" and append earlyprintk=<> to KDUMP_COMMANDLINE_APPEND.
> > You will need to restart kdump service after this.
> >
> > - Enable some debug output during runtime from kexec purgatory. For that one
> > needs to pass additional arguments to /sbin/kexec. You can edit
> > /etc/sysconfig/kdump file and modify "KEXEC_ARGS" to pass additional
> > arguments to /sbin/kexec during kernel load. I use following for my
> > serial console.
> >
> > KEXEC_ARGS="--console-serial --serial=0x3f8 --serial-baud=115200"
> >
> > You will need to restart kdump service.
> The only serial port on this machine is usb serial, which doesn't have io ports.
> From my reading of the kexec man page, it doesn't look like I can tell
> it to use ttyUSB0.

Enabling ttyUSB0 would still need hacks in the dracut/kdump module to pack the
usb serial ko into the initramfs and load it early. We can work on that in Fedora,
since it could help with problems that show up late in boot.

> And because it relies on usb being initialized, this probably isn't
> going to help too much with early boot.
> earlyprintk=tty0 didn't show anything extra after the sysrq-c oops.
> likewise, =ttyUSB0

How about earlyprintk=vga instead of tty0?
Or earlyprintk=efi if the machine boots via EFI.

earlyprintk=dbgp sometimes also helps, but it's a little hard to set up because it
needs a usb debugger. My nokia n900 works well as a debugger. But finding a usable
usb debug port on the host might fail, so this is my last resort for earlyprintk :(

> I'm going to try bisecting the problem I'm debugging again, so I'm not
> going to dig into this much more today.

Another kdump kernel issue I know about is that nouveau sometimes does not work.
If that is the case here, you can try adding "rd.driver.blacklist=nouveau" to
KDUMP_COMMANDLINE_APPEND in /etc/sysconfig/kdump. Or just add "nomodeset" to the 1st
kernel's grub cmdline so that the 2nd kernel reuses it and avoids loading the drm
modules; then earlyprintk=vga could probably show something as well.
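
For reference, the /etc/sysconfig/kdump change could look roughly like the sketch
below. It assumes the file is sourced as shell (as it is on Fedora) and that the
line is placed after the existing KDUMP_COMMANDLINE_APPEND assignment, so the
distro defaults are kept. Putting earlyprintk=vga there too is optional.

```shell
# Sketch only: append to whatever /etc/sysconfig/kdump already sets,
# instead of replacing it. Assumes the file is sourced by a POSIX shell.
KDUMP_COMMANDLINE_APPEND="${KDUMP_COMMANDLINE_APPEND:-} rd.driver.blacklist=nouveau earlyprintk=vga"

# Afterwards restart the kdump service so the 2nd kernel gets reloaded, e.g.:
#   systemctl restart kdump
```

If you go the "nomodeset" route in the 1st kernel instead, the blacklist entry
should not be needed.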

