Re: [watchdog] combine nmi_watchdog and softlockup

From: Frederic Weisbecker
Date: Thu Apr 08 2010 - 19:52:17 EST


On Tue, Apr 06, 2010 at 07:31:15PM +0400, Cyrill Gorcunov wrote:
> On Tue, Apr 06, 2010 at 04:13:30PM +0200, Frederic Weisbecker wrote:
> [...]
> > > +static int watchdog_enable(int cpu)
> > > +{
> > > +	struct perf_event_attr *wd_attr;
> > > +	struct perf_event *event = per_cpu(watchdog_ev, cpu);
> > > +	struct task_struct *p = per_cpu(softlockup_watchdog, cpu);
> > > +
> > > +	/* is it already setup and enabled? */
> > > +	if (event && event->state > PERF_EVENT_STATE_OFF)
> > > +		goto out;
> > > +
> > > +	/* it is setup but not enabled */
> > > +	if (event != NULL)
> > > +		goto out_enable;
> > > +
> > > +	/* Try to register using hardware perf events first */
> > > +	wd_attr = &wd_hw_attr;
> > > +	wd_attr->sample_period = hw_nmi_get_sample_period();
> > > +	event = perf_event_create_kernel_counter(wd_attr, cpu, -1, watchdog_overflow_callback);
> > > +	if (!IS_ERR(event)) {
> > > +		printk(KERN_INFO "NMI watchdog enabled, takes one hw-pmu counter.\n");
> > > +		goto out_save;
> > > +	}
> > > +
> > > +	/* hardware doesn't exist or not supported, fallback to software events */
> > > +	printk(KERN_INFO "NMI watchdog: hardware not available, trying software events\n");
> > > +	wd_attr = &wd_sw_attr;
> > > +	wd_attr->sample_period = softlockup_thresh * NSEC_PER_SEC;
> > > +	event = perf_event_create_kernel_counter(wd_attr, cpu, -1, watchdog_overflow_callback);
> >
> > I fear the cpu clock is not going to help you detect any hard lockups.
> > If you're stuck in an interrupt or an irq-disabled loop, your cpu clock is
> > not going to fire.
> >
>
> I guess it's not supposed to. For such cases only NMI irqs can help, which is what
> the perf events are there for (/me needs to check whether we program the apic timer
> for anything like that). But it should help for other deadlocks. Or am I missing something?


Yeah, but it only covers part of the hard lockup classes: those that
still have interrupts enabled.
