RE: [PATCH] hpwdt: Fix kdump issue in hpwdt

From: Mingarelli, Thomas
Date: Mon Aug 27 2012 - 15:59:23 EST


The main issue here is what happens when an NMI comes in (sourcing NMIs and then panicking the box is hpwdt's main focus) and the system is configured for kdump. We want the kdump to succeed, but if the iLO watchdog timer is left running, it will not: the dump will be interrupted by an ASR (Automatic Server Recovery) reset. This change ensures that the iLO watchdog timer is always stopped in two cases: when a kernel boots (any kernel), and when an NMI arrives and we are in the process of taking a kdump.
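
As a rough illustration of the two cases, here is a small user-space sketch. It is not the driver code: the simulated register variable, the enable bit and all of the sim_* names are made up for this example, and the real driver talks to iLO hardware rather than a plain variable.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the iLO watchdog timer control register. */
static uint8_t sim_timer_con;

/* Assumed enable bit, chosen only for this sketch. */
#define SIM_TIMER_ENABLE 0x01u

/* Arm the simulated watchdog (what opening /dev/watchdog would lead to). */
static void sim_wdt_start(void)
{
	sim_timer_con |= SIM_TIMER_ENABLE;
}

/* Disarm the simulated watchdog by clearing the enable bit. */
static void sim_wdt_stop(void)
{
	sim_timer_con &= (uint8_t)~SIM_TIMER_ENABLE;
}

static bool sim_wdt_running(void)
{
	return (sim_timer_con & SIM_TIMER_ENABLE) != 0;
}

/*
 * Boot case: whichever kernel is coming up (normal or crash kernel),
 * stop the timer unconditionally so that a timer left armed by the
 * previous kernel cannot fire.
 */
static void sim_boot_init(void)
{
	sim_wdt_stop();
}

/*
 * NMI case: if a kdump is about to be taken, stop the timer first so
 * the dump is not cut short by a reset. A real handler would go on to
 * panic the box; that part is omitted here.
 */
static void sim_nmi_arrived(bool taking_kdump)
{
	if (taking_kdump)
		sim_wdt_stop();
}
```

With these helpers, calling sim_wdt_start() followed by sim_boot_init() leaves the timer stopped, and sim_nmi_arrived(true) stops a running timer while sim_nmi_arrived(false) leaves it alone.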


Tom

-----Original Message-----
From: Lars Marowsky-Bree [mailto:lmb@xxxxxxxx]
Sent: Monday, August 27, 2012 2:22 PM
To: Kani, Toshimitsu; wim@xxxxxxxxx; linux-watchdog@xxxxxxxxxxxxxxx
Cc: linux-kernel@xxxxxxxxxxxxxxx; Mingarelli, Thomas; stable@xxxxxxxxxxxxxxx
Subject: Re: [PATCH] hpwdt: Fix kdump issue in hpwdt

On 2012-08-27T12:52:24, Toshi Kani <toshi.kani@xxxxxx> wrote:

> kdump can be interrupted by the watchdog timer when the timer is left
> activated on the crash kernel. Changed the hpwdt driver to disable the
> watchdog timer at boot time. This ensures that the watchdog timer is
> disabled until /dev/watchdog is opened, and prevents the watchdog timer
> from being left running on the crash kernel.

How does this protect against the system hanging again in the crash
kernel, or against hardware caches possibly flushing more data to
shared storage?

(I'm asking from the perspective of the hpwdt being used as a fencing
mechanism in a cluster setting.)

Or is the argument that it's "very unlikely" that a system in such a
state would not make it far enough into the crash kernel?


Regards,
Lars

--
Architect Storage/HA
SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 21284 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde
