Re: [PATCH 7/7] watchdog: s3c2410: Let kernel kick watchdog

From: Sam Protsenko
Date: Sat Oct 30 2021 - 15:25:43 EST


On Sat, 30 Oct 2021 at 20:47, Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
>
> On 10/30/21 9:59 AM, Sam Protsenko wrote:
> > On Sat, 30 Oct 2021 at 18:14, Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
> >>
> >> On 10/30/21 7:29 AM, Sam Protsenko wrote:
> >>> On Fri, 29 Oct 2021 at 03:30, Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
> >>>>
> >>>> On 10/28/21 11:35 AM, Sam Protsenko wrote:
> >>>>> When "tmr_atboot" module param is set, the watchdog is started in
> >>>>> driver's probe. In that case, also set WDOG_HW_RUNNING bit to let
> >>>>> watchdog core driver know it's running. This way wathcdog core can kick
> >>>>> the watchdog for us (if CONFIG_WATCHDOG_HANDLE_BOOT_ENABLED option is
> >>>>> enabled), until user space takes control.
> >>>>>
> >>>>> Signed-off-by: Sam Protsenko <semen.protsenko@xxxxxxxxxx>
> >>>>> ---
> >>>>> drivers/watchdog/s3c2410_wdt.c | 26 +++++++++++++++-----------
> >>>>> 1 file changed, 15 insertions(+), 11 deletions(-)
> >>>>>
> >>>>> diff --git a/drivers/watchdog/s3c2410_wdt.c b/drivers/watchdog/s3c2410_wdt.c
> >>>>> index ca082b1226e3..9af014ff1468 100644
> >>>>> --- a/drivers/watchdog/s3c2410_wdt.c
> >>>>> +++ b/drivers/watchdog/s3c2410_wdt.c
> >>>>> @@ -732,6 +732,21 @@ static int s3c2410wdt_probe(struct platform_device *pdev)
> >>>>> wdt->wdt_device.bootstatus = s3c2410wdt_get_bootstatus(wdt);
> >>>>> wdt->wdt_device.parent = dev;
> >>>>>
> >>>>> + /*
> >>>>> + * If "tmr_atboot" param is non-zero, start the watchdog right now. Also
> >>>>> + * set WDOG_HW_RUNNING bit, so that watchdog core can kick the watchdog.
> >>>>> + *
> >>>>> + * If we're not enabling the watchdog, then ensure it is disabled if it
> >>>>> + * has been left running from the bootloader or other source.
> >>>>> + */
> >>>>> + if (tmr_atboot && started == 0) {
> >>>>> + dev_info(dev, "starting watchdog timer\n");
> >>>>> + s3c2410wdt_start(&wdt->wdt_device);
> >>>>> + set_bit(WDOG_HW_RUNNING, &wdt->wdt_device.status);
> >>>>> + } else if (!tmr_atboot) {
> >>>>> + s3c2410wdt_stop(&wdt->wdt_device);
> >>>>> + }
> >>>>> +
> >>>>
> >>>> This doesn't cover the case where the watchdog is already enabled by the BIOS.
> >>>> In that case, WDOG_HW_RUNNING won't be set, and the watchdog will time out
> >>>> if the userspace handler is not loaded fast enough. The code should consistently
> >>>> set WDOG_HW_RUNNING if the watchdog is running.
> >>>>
> >>>
> >>> As I understand, in the case when bootloader started the watchdog, the
> >>> driver just stops it. You can see it in the code you replied to.
> >>>
> >>> } else if (!tmr_atboot) {
> >>> s3c2410wdt_stop(&wdt->wdt_device);
> >>>
> >>> In other words, having "tmr_atboot" module param makes it irrelevant
> >>> whether bootloader enabled WDT or no.
> >>>
> >>
> >> Sure, but I am concerned about "if (tmr_atboot && started)", which doesn't
> >> set WDOG_HW_RUNNING with your current code, and I was looking for something
> >> like
> >>
> >> if (tmr_atboot) {
> >> if (!started) {
> >> dev_info(dev, "starting watchdog timer\n");
> >> s3c2410wdt_start(&wdt->wdt_device);
> >> }
> >> set_bit(WDOG_HW_RUNNING, &wdt->wdt_device.status);
> >> } else {
> >> s3c2410wdt_stop(&wdt->wdt_device);
> >> }
> >>
> >
> > Wow, I really overlooked that case. Nice catch! Not having '} else {'
> > section is vicious...
> >
> > Though if started != 0, it means s3c2410wdt_set_heartbeat() failed to
> > set wdd->timeout, and without that the watchdog core won't be able to
> > calculate correctly ping interval in watchdog_next_keepalive(), and
> > WDOG_HW_RUNNING bit won't do much good, right? So I'll probably just
> > call s3c2410wdt_stop() in that case, to be on the safe side.
> >
> > Also this 'started' variable name is misleading, I'll convert it to
> > "bool timeout_ok" while at it.
> >
>
> This driver is a mess :-(. "started" true means that the driver doesn't
> work as currently written because there is no known valid timeout. In
> reality, s3c2410wdt_set_heartbeat() should in that case select a valid
> timeout and set it. On top of that, a timeout value out of range should
> never be passed to it in the first place. The check for "if (timeout < 1)"
> is, in that context, pointless. The range check should happen in
> s3c2410wdt_max_timeout(). If that range check is invalid, ie if
> s3c2410wdt_set_heartbeat() fails even though the timeout is in the range
> of 1 ..s3c2410wdt_max_timeout(), s3c2410wdt_max_timeout() is buggy.
>
> The simplest fix (kludge/hack) might be to fail driver installation if
> s3c2410wdt_set_heartbeat() fails.
>

Thanks for suggesting that. I'll send that change along in v2.

> Guenter