Re: [PATCH] [WATCHDOG] Fix kdump when using hpwdt

From: Vivek Goyal
Date: Tue Nov 25 2008 - 09:29:09 EST


On Sun, Nov 23, 2008 at 02:15:24PM +0100, Bernhard Walle wrote:
> When the "hpwdt" module is loaded (even if the /dev/watchdog device is not
> opened), then kdump does not work. The panic kernel either does not start at
> all or crashes in various places.
>
> The problem is that hpwdt_pretimeout is registered with register_die_notifier()
> with the highest possible priority. Because it returns NOTIFY_STOP, the
> crash_nmi_callback which is also registered with register_die_notifier() is
> never executed. This causes the shutdown of other CPUs to fail.
>
> Reversing the order is not an option: crash_nmi_callback executes HLT and so
> never returns normally. Because of that, it must be executed as the last
> notifier, which is currently the case.
>
> So this patch returns NOTIFY_OK so that crash_nmi_callback still gets executed.

Hi Bernhard,

Why does this handler need to run after a crash? IOW, even if the kdump NMI
handler halts the CPU and this handler never gets a chance to run, is
that an issue?

I am getting back to the previous discussion about lowering the priority of
the hpwdt notifier. You mentioned that lowering the priority will not work
because the kdump handler halts the CPUs. But my point is that the kdump
handler is registered dynamically, only after a system crash. Does hpwdt
need to run at that point?

The above patch as such should fix the kdump issue (assuming this driver's
handler always returns), but I don't understand why it needs to run
after a crash.
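
For reference, here is a minimal userspace sketch (not kernel code) of the
ordering problem Bernhard describes: a high-priority notifier that returns
STOP keeps a lower-priority notifier from ever being called. All sim_* names,
the return-code values and the priority of the crash handler are assumptions
made up for illustration; only the "highest possible priority" of the hpwdt
notifier comes from the patch description. The real crash_nmi_callback halts
the CPU instead of returning, which this sketch does not model.

```c
#include <limits.h>
#include <stddef.h>

/* Return codes modeled on the notifier API; the values are illustrative. */
enum { SIM_NOTIFY_OK = 1, SIM_NOTIFY_STOP = 2 };

struct sim_notifier {
	int priority;			/* higher priority is called first */
	int (*call)(void);
	struct sim_notifier *next;
};

/* Insert n so that the chain stays sorted by descending priority. */
static void sim_register(struct sim_notifier **head, struct sim_notifier *n)
{
	while (*head && (*head)->priority >= n->priority)
		head = &(*head)->next;
	n->next = *head;
	*head = n;
}

/* Walk the chain in order; a handler returning STOP ends the walk. */
static void sim_call_chain(struct sim_notifier *head)
{
	for (; head; head = head->next)
		if (head->call() == SIM_NOTIFY_STOP)
			break;
}

static int watchdog_ret;	/* what the simulated watchdog handler returns */
static int crash_handler_ran;	/* set once the simulated crash handler runs */

static int sim_watchdog(void)
{
	return watchdog_ret;
}

static int sim_crash(void)
{
	/* The real handler would halt the CPU here and never return. */
	crash_handler_ran = 1;
	return SIM_NOTIFY_OK;
}

/*
 * Build a two-entry chain and report whether the crash handler was reached.
 * Priority 0 for the crash handler is an assumption, not taken from the patch.
 */
int sim_crash_handler_runs(int ret)
{
	struct sim_notifier wd = { INT_MAX, sim_watchdog, NULL };
	struct sim_notifier crash = { 0, sim_crash, NULL };
	struct sim_notifier *chain = NULL;

	watchdog_ret = ret;
	crash_handler_ran = 0;
	sim_register(&chain, &crash);
	sim_register(&chain, &wd);
	sim_call_chain(chain);
	return crash_handler_ran;
}
```

Under these assumptions, sim_crash_handler_runs(SIM_NOTIFY_STOP) returns 0
and sim_crash_handler_runs(SIM_NOTIFY_OK) returns 1, which is the difference
the one-line patch relies on.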

Thanks
Vivek

>
>
> Signed-off-by: Bernhard Walle <bwalle@xxxxxxx>
> Cc: Wim Van Sebroeck <wim@xxxxxxxxx>
> Cc: Thomas Mingarelli <thomas.mingarelli@xxxxxx>
> Cc: Vivek Goyal <vgoyal@xxxxxxxxxx>
> ---
> drivers/watchdog/hpwdt.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/watchdog/hpwdt.c b/drivers/watchdog/hpwdt.c
> index a3765e0..21fe202 100644
> --- a/drivers/watchdog/hpwdt.c
> +++ b/drivers/watchdog/hpwdt.c
> @@ -482,7 +482,7 @@ static int hpwdt_pretimeout(struct notifier_block *nb, unsigned long ulReason,
> "Management Log for details.\n");
> }
>
> - return NOTIFY_STOP;
> + return NOTIFY_OK;
> }
>
> /*
> --
> 1.6.0.2