On Fri, Jun 27, 2014 at 11:12:59AM -0400, Boris Ostrovsky wrote:
> Yes, it fails because xen_late_init_mcelog() registers /dev/mcelog and (I
> think) it happens before mcheck_init_device().

Yes, mcheck_init_device() is device_initcall_sync() while
xen_late_init_mcelog() is device_initcall().

> In other words, misc_register() expected to fail in mcheck/mce.c on
> (privileged?) PV guests (provided right CONFIG_XEN_* is set).

So

  cef12ee52b05 ("xen/mce: Add mcelog support for Xen platform")

made it this way so that Xen's init routine runs first.
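
To make the ordering concrete, here is a rough userspace model of it, not kernel code. It assumes that both init routines try to claim the same misc device and that the second attempt gets an EBUSY-style error. All sim_* names and EBUSY_SIM are made up for illustration; only the relative order of device_initcall() vs. device_initcall_sync() is taken from the discussion above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified model: device_initcall() functions run before
 * device_initcall_sync() functions, regardless of link order. */
enum sim_level { SIM_LEVEL_DEVICE = 0, SIM_LEVEL_DEVICE_SYNC = 1 };

#define EBUSY_SIM 16 /* illustrative stand-in for an "already taken" error */

static const char *sim_owner; /* who owns the mcelog node, NULL if free */
static int sim_xen_rc, sim_mce_rc;

/* Hypothetical stand-in for misc_register(): only one owner is allowed. */
static int sim_misc_register(const char *owner)
{
	if (sim_owner)
		return -EBUSY_SIM;
	sim_owner = owner;
	return 0;
}

static int sim_xen_late_init_mcelog(void)
{
	sim_xen_rc = sim_misc_register("xen");
	return sim_xen_rc;
}

static int sim_mcheck_init_device(void)
{
	sim_mce_rc = sim_misc_register("mce");
	return sim_mce_rc;
}

struct sim_initcall {
	enum sim_level level;
	int (*fn)(void);
};

/* Listed in the "wrong" order on purpose: the level decides, not the position. */
static struct sim_initcall sim_calls[] = {
	{ SIM_LEVEL_DEVICE_SYNC, sim_mcheck_init_device },
	{ SIM_LEVEL_DEVICE,      sim_xen_late_init_mcelog },
};

/* Run all initcalls level by level, lowest level first. */
static void sim_run_initcalls(void)
{
	sim_owner = NULL;
	for (int lvl = SIM_LEVEL_DEVICE; lvl <= SIM_LEVEL_DEVICE_SYNC; lvl++) {
		for (size_t i = 0; i < sizeof(sim_calls) / sizeof(sim_calls[0]); i++) {
			assert(sim_calls[i].fn != NULL);
			if ((int)sim_calls[i].level == lvl)
				sim_calls[i].fn();
		}
	}
}
```

After sim_run_initcalls(), the Xen routine holds the device and the mce routine has seen the registration failure, which is the "fails by design" case in this model.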
So it is not the case that misc_register() fails often on xen but it is
*supposed* to fail by design, when running in dom0. And *then* you need
the notifier *not* unregistered on the error path so that the timers do
get deleted properly.
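
A toy model of why the notifier has to stay, under the assumption that the per-CPU timers are only torn down from the CPU hotplug notifier. Everything here (the sim_* names, the two-CPU setup, the -16 error value) is hypothetical and only illustrates the argument; it is not the actual mce.c code.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_SIM 2

static bool sim_timer_active[NR_CPUS_SIM];
static bool sim_notifier_registered;

/* Hypothetical hotplug callback: deletes the timer of the CPU going down. */
static void sim_cpu_down_notifier(int cpu)
{
	sim_timer_active[cpu] = false;
}

/* Take a CPU offline; the callback only runs if the notifier is still registered. */
static void sim_cpu_down(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_SIM);
	if (sim_notifier_registered)
		sim_cpu_down_notifier(cpu);
}

/* Models device init in dom0, where registering the char device fails by design. */
static int sim_device_init(bool unregister_on_error)
{
	int err;

	for (int cpu = 0; cpu < NR_CPUS_SIM; cpu++)
		sim_timer_active[cpu] = true; /* assumed: timers were armed at CPU init */

	sim_notifier_registered = true;

	err = -16; /* the expected registration failure in this model */
	if (err && unregister_on_error)
		sim_notifier_registered = false; /* the problematic unwinding */

	return err;
}

/* Returns true if a timer is still armed on a CPU that has gone offline. */
static bool sim_timer_leaks(bool unregister_on_error)
{
	sim_device_init(unregister_on_error);
	sim_cpu_down(1);
	return sim_timer_active[1];
}
```

In this model, sim_timer_leaks(true) leaves the timer armed on the offlined CPU, while sim_timer_leaks(false) does not.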

Ok, I see it now. Frankly, I'm not really sure I want to rush this in
now because it might break something else, who TF knows what.
Right now my gut feeling tells me we should still queue it for 3.17 and
have it run for a while in linux-next. We can backport it to stable
later after some testing...