Kernel bug in root=nfsroot codepath

From: John Z. Bohach
Date: Sat Jun 11 2011 - 23:10:30 EST


There is a problem with S5 (the ACPI soft-off state) and nfsroot.

The kernel will NOT enter S5 if and only if root=nfsroot.

With the same rootfs burned to disk, it works fine. I've tracked this
down to the

__raw_notifier_call_chain()

function. The call trace sequence is:

kernel_power_off()
...
disable_nonboot_cpus()
_cpu_down()
...
cpu_die()
cpu_notify_nofail()
cpu_notify()
...
__cpu_notify()
__raw_notifier_call_chain()
notifier_call_chain()
?? strange second call from unknown location to:
__raw_notifier_call_chain()
then it just stops...
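
For readers unfamiliar with notifier chains, here is a minimal userspace
sketch of how such a chain is walked. It is a simplified model, not the
kernel source: chain_register(), chain_call(), count_cb() and veto_cb() are
hypothetical names made up for illustration, and only the NOTIFY_* values
follow include/linux/notifier.h. The point it shows is that the walk is
synchronous, so if one callback never returns, nothing after it on the chain
(or after the walk) runs. Whether that is what happens here is not
established by the trace above.

```c
#include <assert.h>
#include <stddef.h>

/* Return codes modeled on include/linux/notifier.h. */
#define NOTIFY_DONE      0x0000
#define NOTIFY_OK        0x0001
#define NOTIFY_STOP_MASK 0x8000
#define NOTIFY_BAD       (NOTIFY_STOP_MASK | 0x0002)

/* Simplified notifier block: callback, next pointer, priority. */
struct notifier_block {
	int (*notifier_call)(struct notifier_block *nb,
			     unsigned long action, void *data);
	struct notifier_block *next;
	int priority;
};

/* Hypothetical helper: insert nb so the list stays sorted by
 * descending priority. */
static void chain_register(struct notifier_block **head,
			   struct notifier_block *nb)
{
	assert(nb != NULL && nb->notifier_call != NULL);
	while (*head != NULL && (*head)->priority >= nb->priority)
		head = &(*head)->next;
	nb->next = *head;
	*head = nb;
}

/* Hypothetical helper: call each callback in order. The walk stops
 * early if a callback sets NOTIFY_STOP_MASK. Each call is synchronous,
 * so a callback that blocks forever stalls the whole walk. */
static int chain_call(struct notifier_block *head, unsigned long action,
		      void *data, int *nr_calls)
{
	int ret = NOTIFY_DONE;
	struct notifier_block *nb = head;

	while (nb != NULL) {
		struct notifier_block *next = nb->next;

		ret = nb->notifier_call(nb, action, data);
		if (nr_calls != NULL)
			(*nr_calls)++;
		if (ret & NOTIFY_STOP_MASK)
			break;
		nb = next;
	}
	return ret;
}

/* Example callback: bump the int counter passed in via data. */
static int count_cb(struct notifier_block *nb, unsigned long action,
		    void *data)
{
	int *hits = data;

	(void)nb;
	(void)action;
	(*hits)++;
	return NOTIFY_OK;
}

/* Example callback: veto the event, which stops the walk. */
static int veto_cb(struct notifier_block *nb, unsigned long action,
		   void *data)
{
	(void)nb;
	(void)action;
	(void)data;
	return NOTIFY_BAD;
}
```

Registering count_cb at priority 10, veto_cb at 5 and another count_cb at 0,
then calling chain_call(), runs only the first two callbacks: the veto stops
the walk before the lowest-priority entry is reached.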

The machine is not completely hung, since I can see NFS timeout messages
after a few minutes (probably printed from interrupt context), but no
further printk output appears, and the system stays unresponsive in this
state until it is physically reset.

This is the third time I'm posting this...any ideas? If I simply skip
the disable_nonboot_cpus() call, then it powers down fine. This only
happens with root=nfsroot. This happens with multiple kernels and goes
back to at least 2.6.16 and is there even with today's latest kernel.

To try to duplicate this, boot an nfsroot-ed machine into run-level 1,
and run 'halt -n -d -f -p'.

Thanks,
John