Re: [RFC][PATCH 0/3] x86/nmi: Print all cpu stacks from NMI safely

From: Jiri Kosina
Date: Thu Jun 19 2014 - 19:27:33 EST


On Thu, 19 Jun 2014, Steven Rostedt wrote:

> > > > The idea basically is to *switch* what arch_trigger_all_cpu_backtrace()
> > > > and arch_trigger_all_cpu_backtrace_handler() are doing; i.e. use the NMI
> > > > as a way to stop all the CPUs (one by one), and let the CPU that is
> > > > sending the NMIs around to actually walk and dump the stacks of the CPUs
> > > > receiving the NMI IPI.
> > >
> > > And this is cleaner? Stopping a CPU via NMI and then what happens if
> > > something else goes wrong and that CPU never starts back up? This
> > > sounds like something that can cause more problems than it was
> > > reporting on.
> >
> > It's going to get NMI in exactly the same situations it does with the
> > current arch_trigger_all_cpu_backtrace(), the only difference being that
> > it doesn't try to invoke printk() from inside NMI. The IPI-NMI is used
> > solely as a point of synchronization for the stack dumping.
>
> Well, all CPUs are going to be spinning until the main CPU prints
> everything out. That's not quite the same thing as what it use to do.

Is there a reason for all CPUs to be spinning until everything is printed
out? Each CPU will be spinning only until its own stack is printed out,
while the other CPUs keep running uninterrupted.

I don't think there is a need for a global stop_machine()-like
synchronization here. The printing CPU will send the IPI to CPU N+1
only after it has finished printing the stack trace of CPU N.
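
To make the difference concrete, here is a rough userspace model of the two
orderings. It is only a sketch and not kernel code: NCPUS, PRINT_TICKS,
struct stall_stats, dump_serialized() and dump_global_stop() are made-up
names for illustration, CPU 0 is assumed to be the printing CPU, and one
"tick" stands for an arbitrary unit of time spent parked in the NMI handler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NCPUS       4   /* hypothetical number of CPUs in the model */
#define PRINT_TICKS 5   /* hypothetical cost of printing one stack  */

/* How long each CPU sat parked, and the worst-case number parked at once. */
struct stall_stats {
	int stalled[NCPUS];
	int max_parked_at_once;
};

/* Advance the model by one tick and account for every parked CPU. */
static void tick(const bool parked[NCPUS], struct stall_stats *s)
{
	int now = 0;

	assert(s != NULL);
	for (int cpu = 0; cpu < NCPUS; cpu++) {
		if (parked[cpu]) {
			s->stalled[cpu]++;
			now++;
		}
	}
	if (now > s->max_parked_at_once)
		s->max_parked_at_once = now;
}

/* Serialized scheme: CPU N+1 is interrupted only after CPU N is printed. */
struct stall_stats dump_serialized(void)
{
	struct stall_stats s = {0};
	bool parked[NCPUS] = {false};

	for (int target = 1; target < NCPUS; target++) {
		parked[target] = true;          /* NMI IPI to this CPU only */
		for (int t = 0; t < PRINT_TICKS; t++)
			tick(parked, &s);       /* CPU 0 prints its stack   */
		parked[target] = false;         /* release before moving on */
	}
	return s;
}

/* stop_machine()-like scheme: everybody is parked until all is printed. */
struct stall_stats dump_global_stop(void)
{
	struct stall_stats s = {0};
	bool parked[NCPUS] = {false};

	for (int cpu = 1; cpu < NCPUS; cpu++)
		parked[cpu] = true;             /* NMI IPI to all other CPUs */
	for (int target = 1; target < NCPUS; target++)
		for (int t = 0; t < PRINT_TICKS; t++)
			tick(parked, &s);
	return s;
}
```

In this model the serialized variant never has more than one CPU parked, and
each CPU is stalled for PRINT_TICKS only; the global variant stalls every
CPU for PRINT_TICKS * (NCPUS - 1).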

Thanks,

--
Jiri Kosina
SUSE Labs