Re: [PATCH v3 2/2] watchdog: fix w83627hf_wdt clear timeout expired

From: Guenter Roeck
Date: Wed Apr 03 2013 - 00:21:51 EST


On Mon, Apr 01, 2013 at 09:59:00PM -0700, Tony Chung wrote:
> Thanks Guenter!
> I agree with you. My first reaction was also about a small watchdog
> server that will start in early boot process. There are pros and
> cons. For example, there are many types of watchdog devices such as
> ipmi_watchdog which can accept more than 255 seconds for timeout. So
> you really need udev to pick the correct watchdog driver. It could be
> very complex.
>
> Our requirements don't call for watchdog protection during the boot
> process until the application is fully up, but a driver should not
> assume anything.

Ok, but then the BIOS should not enable the watchdog.

> Anyway, an unexpected reboot is definitely a bug that needs to be
> fixed. It is easily reproducible. Depending on your hardware

Agreed.

> and BIOS settings, just reboot the box, wait for 5 minutes and then
> run "insmod w83627hf_wdt.ko". The box just reboots by itself. The
> watchdog server is not even started.
>
Doesn't happen for me, as the watchdog is initially not enabled in my system,
and bit 4 is never set. And when it gets set, the system immediately reboots.

> This line is actually the original fix, which has been running for over
> a year:
> outb_p(0, WDT_EFDR); /* disable to prevent reboot */
>
Unfortunately this turns off the watchdog if it was running and has triggered.

> When I tried to clean it up, I thought I didn't need it, but it
> turned out it was still needed.
> When I changed it from 0xC0 to 0xD0, it still rebooted.
>
So it looks like the watchdog triggered but for some reason did not cause
a reset, while clearing the trigger flag does.

Ultimately, the new code turns the watchdog off if it was running and has
already triggered but did not cause a reset. The effect is that the system
will hang if it gets stuck before the watchdog application is started later
on. Maybe that does not matter for your application, but others will
definitely want that protection.

So I don't think the solution is correct. We should find out why the watchdog
trigger did not cause a reset. Also, I think it would be better in this
situation (if watchdog has triggered) to restart it with the default timeout.
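
To illustrate the difference, here is a rough, untested sketch against a
simulated register pair rather than the real chip. The struct, the function
names, the 60 second default and the "reload first, then clear the flag"
ordering are all assumptions made up for illustration; only bit 4 as the
trigger flag and the 0xD0/0xC0 values come from this thread. Whether
reloading first actually avoids the reset on real hardware is exactly what
still needs to be confirmed.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative assumptions, not taken from any datasheet. */
#define WDT_STATUS_EXPIRED  0x10  /* bit 4: timeout has occurred */
#define WDT_DEFAULT_TIMEOUT 60    /* seconds, hypothetical default */

/* Simulated hardware state standing in for the real registers. */
struct fake_wdt {
	uint8_t status;   /* control/status register */
	uint8_t counter;  /* timeout counter; 0 means the watchdog is stopped */
};

/* Behaviour I am objecting to: stop the watchdog, then clear the flag. */
static void wdt_init_disable(struct fake_wdt *w)
{
	w->counter = 0;
	w->status &= (uint8_t)~WDT_STATUS_EXPIRED;
}

/*
 * Suggested alternative: if the watchdog has already triggered, reload it
 * with a default timeout before clearing the flag, so that it keeps
 * protecting the system. If it has not triggered, leave it alone.
 */
static void wdt_init_restart(struct fake_wdt *w, uint8_t timeout)
{
	assert(timeout > 0);  /* 0 would stop the watchdog again */

	if (w->status & WDT_STATUS_EXPIRED) {
		w->counter = timeout;
		w->status &= (uint8_t)~WDT_STATUS_EXPIRED;
	}
}
```

With status at 0xD0 on entry, wdt_init_restart() leaves the status at 0xC0
and the counter at the default timeout, while wdt_init_disable() leaves the
counter at 0, i.e. no protection until the application starts.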

What is the exact chip type in your system? I want to have a look at the
datasheet; maybe I can find out how it can trigger without causing a reset.

Thanks,
Guenter