Re: [PATCH v3 08/11] watchdog/hpwdt: Programable Pretimeout NMI

From: Jerry Hoemann
Date: Fri Feb 16 2018 - 18:46:26 EST


On Fri, Feb 16, 2018 at 12:34:40PM -0800, Guenter Roeck wrote:
> On Thu, Feb 15, 2018 at 04:43:57PM -0700, Jerry Hoemann wrote:
> > Make whether or not the hpwdt watchdog delivers a pretimeout NMI
> > programable by the user.
> >
> > The underlying iLO hardware is programmable as to whether or not
> > a pre-timeout NMI is delivered to the system before the iLO resets
> > the system. However, the iLO does not allow for programming the
> > length of time that NMI is delivered before the system is reset.
> >
> > Hence, in hpwdt_set_pretimeout, val == 0 disables the NMI. Any
> > non-zero value sets the pretimeout length to what the hardware
> > supports.
> >
> > Signed-off-by: Jerry Hoemann <jerry.hoemann@xxxxxxx>
> > ---
> > drivers/watchdog/hpwdt.c | 42 ++++++++++++++++++++++++++++++++++++------
> > 1 file changed, 36 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/watchdog/hpwdt.c b/drivers/watchdog/hpwdt.c
> > index da9a04101814..dc0ad20738ed 100644
> > --- a/drivers/watchdog/hpwdt.c
> > +++ b/drivers/watchdog/hpwdt.c
> > @@ -28,12 +28,15 @@
> > #define TICKS_TO_SECS(ticks) ((ticks) * 128 / 1000)
> > #define HPWDT_MAX_TIMER TICKS_TO_SECS(65535)
> > #define DEFAULT_MARGIN 30
> > +#define PRETIMEOUT_SEC 9
> >
> > static unsigned int soft_margin = DEFAULT_MARGIN; /* in seconds */
> > -static unsigned int reload; /* the computed soft_margin */
> > static bool nowayout = WATCHDOG_NOWAYOUT;
> > #ifdef CONFIG_HPWDT_NMI_DECODING
> > static unsigned int allow_kdump = 1;
> > +static bool pretimeout = 1;
> > +#else
> > +static bool pretimeout;
> > #endif
> >
> static bool pretimeout = IS_ENABLED(CONFIG_HPWDT_NMI_DECODING);

ack. will do.

>
> > static void __iomem *pci_mem_addr; /* the PCI-memory address */
> > @@ -55,10 +58,12 @@ static struct watchdog_device hpwdt_dev;
> > */
> > static int hpwdt_start(struct watchdog_device *dev)
> > {
> > - reload = SECS_TO_TICKS(dev->timeout);
> > + int control = 0x81 | (pretimeout ? 0x4 : 0);
> > + int reload = SECS_TO_TICKS(dev->timeout);
> >
> > + dev_dbg(dev->parent, "start watchdog 0x%08x:0x%02x\n", reload, control);
> > iowrite16(reload, hpwdt_timer_reg);
> > - iowrite8(0x85, hpwdt_timer_con);
> > + iowrite8(control, hpwdt_timer_con);
> >
> > return 0;
> > }
> > @@ -67,6 +72,8 @@ static int hpwdt_stop(struct watchdog_device *dev)
> > {
> > unsigned long data;
> >
> > + dev_dbg(dev->parent, "stop watchdog\n");
> > +
> Unrelated.
>
> > data = ioread8(hpwdt_timer_con);
> > data &= 0xFE;
> > iowrite8(data, hpwdt_timer_con);
> > @@ -75,8 +82,9 @@ static int hpwdt_stop(struct watchdog_device *dev)
> >
> > static int hpwdt_ping(struct watchdog_device *dev)
> > {
> > - reload = SECS_TO_TICKS(dev->timeout);
> > + int reload = SECS_TO_TICKS(dev->timeout);
> >
> > + dev_dbg(dev->parent, "ping watchdog 0x%08x\n", reload);
>
> Unrelated. If you want to add debug messages, please do it
> in a separate patch.


Different patch, but same set? I'll move these (and ones from earlier
patch to a new separate patch later in set.)

>
> > iowrite16(reload, hpwdt_timer_reg);
> >
> > return 0;
> > @@ -98,12 +106,21 @@ static int hpwdt_settimeout(struct watchdog_device *dev, unsigned int val)
> > }
> >
> > #ifdef CONFIG_HPWDT_NMI_DECODING /* { */
> > +static int hpwdt_set_pretimeout(struct watchdog_device *dev, unsigned int val)
> > +{
> > + if (val && (val != PRETIMEOUT_SEC)) {
>
> Unnecessary ( )


There are several things going on here. I'm not sure which one the above
comment is intended.

While a pretimeout NMI isn't required by the HW to be enabled, if enabled the
length of pretimeout is fixed by HW.

I didn't see anything in the API that would allow us to communicate to
the user this "feature." timeout at leasst has both min_timeout and max_timeout, but
I didn't see similar for pretimeout. I also don't think its reasonable to fail
here if the requested value is not 9 as the user really has no way of knowing what
the valid range of pretimeout values are. So I accept, any non-zero value
for pretimeout, but then set pretimeout to be 9.

But at the same time, I don't like to siliently change a human request
w/o at least warning.



>
> The actual timeout can be a value smaller than 9 seconds.
> Minimum is 1 second. What happens if the user configures
> a timeout of less than 9 seconds as well as a pretimeout ?
> Will it fire immediately ?

The architecture is silient on this issue. My experience with
this is that if timeout < 9 seconds, the NMI is not issued.
System resets when the timeout expires. This could be implementation
dependent.

Note, this is not a new issue.

I thought about setting the min timeout to ten seconds to avoid this situation.

I haven't dug into the various user level clients of watchdog so I'm not sure
what the impact of making this change would be to them.


>
> > + dev_info(dev->parent, "Setting pretimeout to %d\n", PRETIMEOUT_SEC);
>
> Please no ongoing logging noise. This can easily be abused to clog
> the kernel log.

Good point. I will look at WARN_ONCE or something similar.

>
> > + val = PRETIMEOUT_SEC;
> > + }
> > + dev->pretimeout = val;
> > + pretimeout = val ? 1 : 0;
>
> pretimeout = !!val;
>

--

-----------------------------------------------------------------------------
Jerry Hoemann Software Engineer Hewlett Packard Enterprise
-----------------------------------------------------------------------------