Re: [PATCHv3 3/4] watchdog: rti-wdt: attach to running watchdog during probe

From: Guenter Roeck
Date: Tue Jul 14 2020 - 03:06:22 EST


On 7/13/20 11:50 PM, Tero Kristo wrote:
> On 14/07/2020 08:15, Guenter Roeck wrote:
>> On 7/13/20 6:18 AM, Tero Kristo wrote:
>>> If the RTI watchdog is running already during probe, the driver must
>>> configure itself to match the HW. Window size and timeout is probed from
>>> hardware, and the last keepalive ping is adjusted to match it also.
>>>
>>> Signed-off-by: Tero Kristo <t-kristo@xxxxxx>
>>> ---
>>> Â drivers/watchdog/rti_wdt.c | 111 +++++++++++++++++++++++++++++++++----
>>> Â 1 file changed, 101 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/drivers/watchdog/rti_wdt.c b/drivers/watchdog/rti_wdt.c
>>> index d456dd72d99a..45dfc546e371 100644
>>> --- a/drivers/watchdog/rti_wdt.c
>>> +++ b/drivers/watchdog/rti_wdt.c
>>> @@ -35,7 +35,11 @@
>>> Â Â #define RTIWWDRX_NMIÂÂÂ 0xa
>>> Â -#define RTIWWDSIZE_50PÂÂÂ 0x50
>>> +#define RTIWWDSIZE_50PÂÂÂÂÂÂÂ 0x50
>>> +#define RTIWWDSIZE_25PÂÂÂÂÂÂÂ 0x500
>>> +#define RTIWWDSIZE_12P5ÂÂÂÂÂÂÂ 0x5000
>>> +#define RTIWWDSIZE_6P25ÂÂÂÂÂÂÂ 0x50000
>>> +#define RTIWWDSIZE_3P125ÂÂÂ 0x500000
>>> Â Â #define WDENABLE_KEYÂÂÂ 0xa98559da
>>> Â @@ -48,7 +52,7 @@
>>> Â Â #define DWDSTÂÂÂÂÂÂÂÂÂÂÂ BIT(1)
>>> Â -static int heartbeat;
>>> +static int heartbeat = DEFAULT_HEARTBEAT;
>>> Â Â /*
>>> ÂÂ * struct to hold data for each WDT device
>>> @@ -79,11 +83,9 @@ static int rti_wdt_start(struct watchdog_device *wdd)
>>> ÂÂÂÂÂÂ * be petted during the open window; not too early or not too late.
>>> ÂÂÂÂÂÂ * The HW configuration options only allow for the open window size
>>> ÂÂÂÂÂÂ * to be 50% or less than that; we obviouly want to configure the open
>>> -ÂÂÂÂ * window as large as possible so we select the 50% option. To avoid
>>> -ÂÂÂÂ * any glitches, we accommodate 5% safety margin also, so we setup
>>> -ÂÂÂÂ * the min_hw_hearbeat at 55% of the timeout period.
>>> +ÂÂÂÂ * window as large as possible so we select the 50% option.
>>> ÂÂÂÂÂÂ */
>>> -ÂÂÂ wdd->min_hw_heartbeat_ms = 11 * wdd->timeout * 1000 / 20;
>>> +ÂÂÂ wdd->min_hw_heartbeat_ms = 500 * wdd->timeout;
>>> Â ÂÂÂÂÂ /* Generate NMI when wdt expires */
>>> ÂÂÂÂÂ writel_relaxed(RTIWWDRX_NMI, wdt->base + RTIWWDRXCTRL);
>>> @@ -110,7 +112,48 @@ static int rti_wdt_ping(struct watchdog_device *wdd)
>>> ÂÂÂÂÂ return 0;
>>> Â }
>>> Â -static unsigned int rti_wdt_get_timeleft(struct watchdog_device *wdd)
>>> +static int rti_wdt_setup_hw_hb(struct watchdog_device *wdd, u32 wsize)
>>> +{
>>> +ÂÂÂ /*
>>> +ÂÂÂÂ * RTI only supports a windowed mode, where the watchdog can only
>>> +ÂÂÂÂ * be petted during the open window; not too early or not too late.
>>> +ÂÂÂÂ * The HW configuration options only allow for the open window size
>>> +ÂÂÂÂ * to be 50% or less than that.
>>> +ÂÂÂÂ */
>>> +ÂÂÂ switch (wsize) {
>>> +ÂÂÂ case RTIWWDSIZE_50P:
>>> +ÂÂÂÂÂÂÂ /* 50% open window => 50% min heartbeat */
>>> +ÂÂÂÂÂÂÂ wdd->min_hw_heartbeat_ms = 500 * heartbeat;
>>> +ÂÂÂÂÂÂÂ break;
>>> +
>>> +ÂÂÂ case RTIWWDSIZE_25P:
>>> +ÂÂÂÂÂÂÂ /* 25% open window => 75% min heartbeat */
>>> +ÂÂÂÂÂÂÂ wdd->min_hw_heartbeat_ms = 750 * heartbeat;
>>> +ÂÂÂÂÂÂÂ break;
>>> +
>>> +ÂÂÂ case RTIWWDSIZE_12P5:
>>> +ÂÂÂÂÂÂÂ /* 12.5% open window => 87.5% min heartbeat */
>>> +ÂÂÂÂÂÂÂ wdd->min_hw_heartbeat_ms = 875 * heartbeat;
>>> +ÂÂÂÂÂÂÂ break;
>>> +
>>> +ÂÂÂ case RTIWWDSIZE_6P25:
>>> +ÂÂÂÂÂÂÂ /* 6.5% open window => 93.5% min heartbeat */
>>> +ÂÂÂÂÂÂÂ wdd->min_hw_heartbeat_ms = 935 * heartbeat;
>>> +ÂÂÂÂÂÂÂ break;
>>> +
>>> +ÂÂÂ case RTIWWDSIZE_3P125:
>>> +ÂÂÂÂÂÂÂ /* 3.125% open window => 96.9% min heartbeat */
>>> +ÂÂÂÂÂÂÂ wdd->min_hw_heartbeat_ms = 969 * heartbeat;
>>> +ÂÂÂÂÂÂÂ break;
>>> +
>>> +ÂÂÂ default:
>>> +ÂÂÂÂÂÂÂ return -EINVAL;
>>> +ÂÂÂ }
>>> +
>>> +ÂÂÂ return 0;
>>> +}
>>> +
>>> +static unsigned int rti_wdt_get_timeleft_ms(struct watchdog_device *wdd)
>>> Â {
>>> ÂÂÂÂÂ u64 timer_counter;
>>> ÂÂÂÂÂ u32 val;
>>> @@ -123,11 +166,18 @@ static unsigned int rti_wdt_get_timeleft(struct watchdog_device *wdd)
>>> Â ÂÂÂÂÂ timer_counter = readl_relaxed(wdt->base + RTIDWDCNTR);
>>> Â +ÂÂÂ timer_counter *= 1000;
>>> +
>>> ÂÂÂÂÂ do_div(timer_counter, wdt->freq);
>>> Â ÂÂÂÂÂ return timer_counter;
>>> Â }
>>> Â +static unsigned int rti_wdt_get_timeleft(struct watchdog_device *wdd)
>>> +{
>>> +ÂÂÂ return rti_wdt_get_timeleft_ms(wdd) / 1000;
>>> +}
>>> +
>>> Â static const struct watchdog_info rti_wdt_info = {
>>> ÂÂÂÂÂ .options = WDIOF_KEEPALIVEPING,
>>> ÂÂÂÂÂ .identity = "K3 RTI Watchdog",
>>> @@ -148,6 +198,7 @@ static int rti_wdt_probe(struct platform_device *pdev)
>>> ÂÂÂÂÂ struct watchdog_device *wdd;
>>> ÂÂÂÂÂ struct rti_wdt_device *wdt;
>>> ÂÂÂÂÂ struct clk *clk;
>>> +ÂÂÂ u32 last_ping = 0;
>>> Â ÂÂÂÂÂ wdt = devm_kzalloc(dev, sizeof(*wdt), GFP_KERNEL);
>>> ÂÂÂÂÂ if (!wdt)
>>> @@ -169,6 +220,14 @@ static int rti_wdt_probe(struct platform_device *pdev)
>>> ÂÂÂÂÂÂÂÂÂ return -EINVAL;
>>> ÂÂÂÂÂ }
>>> Â +ÂÂÂ /*
>>> +ÂÂÂÂ * If watchdog is running at 32k clock, it is not accurate.
>>> +ÂÂÂÂ * Adjust frequency down in this case so that we don't pet
>>> +ÂÂÂÂ * the watchdog too often.
>>> +ÂÂÂÂ */
>>> +ÂÂÂ if (wdt->freq < 32768)
>>> +ÂÂÂÂÂÂÂ wdt->freq = wdt->freq * 9 / 10;
>>> +
>>
>> So this is now only a problem is the window size was set restrictively
>> in the BOS/ROMMON. Meaning it is still a problem. How is this better than
>> before ? Why not just always set the window size to something reasonable ?
>
> Re-programming of the watchdog only takes effect at the next ping, then and only then will it adjust the window size + timeout period.
>

What a mess. I am glad this isn't hardware I have to deal with.

>>
>>
>>> ÂÂÂÂÂ pm_runtime_enable(dev);
>>> ÂÂÂÂÂ ret = pm_runtime_get_sync(dev);
>>> ÂÂÂÂÂ if (ret) {
>>> @@ -185,11 +244,8 @@ static int rti_wdt_probe(struct platform_device *pdev)
>>> ÂÂÂÂÂ wdd->min_timeout = 1;
>>> ÂÂÂÂÂ wdd->max_hw_heartbeat_ms = (WDT_PRELOAD_MAX << WDT_PRELOAD_SHIFT) /
>>> ÂÂÂÂÂÂÂÂÂ wdt->freq * 1000;
>>> -ÂÂÂ wdd->timeout = DEFAULT_HEARTBEAT;
>>> ÂÂÂÂÂ wdd->parent = dev;
>>> Â -ÂÂÂ watchdog_init_timeout(wdd, heartbeat, dev);
>>> -
>>> ÂÂÂÂÂ watchdog_set_drvdata(wdd, wdt);
>>> ÂÂÂÂÂ watchdog_set_nowayout(wdd, 1);
>>> ÂÂÂÂÂ watchdog_set_restart_priority(wdd, 128);
>>> @@ -201,12 +257,47 @@ static int rti_wdt_probe(struct platform_device *pdev)
>>> ÂÂÂÂÂÂÂÂÂ goto err_iomap;
>>> ÂÂÂÂÂ }
>>> Â +ÂÂÂ if (readl(wdt->base + RTIDWDCTRL) == WDENABLE_KEY) {
>>> +ÂÂÂÂÂÂÂ u32 time_left_ms;
>>> +ÂÂÂÂÂÂÂ u64 heartbeat_ms;
>>> +ÂÂÂÂÂÂÂ u32 wsize;
>>> +
>>> +ÂÂÂÂÂÂÂ set_bit(WDOG_HW_RUNNING, &wdd->status);
>>> +ÂÂÂÂÂÂÂ time_left_ms = rti_wdt_get_timeleft_ms(wdd);
>>> +ÂÂÂÂÂÂÂ heartbeat_ms = readl(wdt->base + RTIDWDPRLD);
>>> +ÂÂÂÂÂÂÂ heartbeat_ms <<= WDT_PRELOAD_SHIFT;
>>> +ÂÂÂÂÂÂÂ heartbeat_ms *= 1000;
>>> +ÂÂÂÂÂÂÂ heartbeat_ms /= wdt->freq;
>>
>> Ah yes, the pitfalls of 64-bit divide operations.
>
> Should I convert this to use do_div? With 64bit archs this code is targeted at it runs just fine.
>

... and test builds on 32 bit architectures fail to compile, as reported
by 0-day. Maybe you don't care, fine, but then please remove all use of do_div
or other 64-bit divide functions from the driver and mark it as depending
on 64 bit.

>>
>>> +ÂÂÂÂÂÂÂ if (heartbeat_ms / 1000 != heartbeat)
>>> +ÂÂÂÂÂÂÂÂÂÂÂ dev_warn(dev, "watchdog already running, ignoring heartbeat config!\n");
>>> +
>>> +ÂÂÂÂÂÂÂ heartbeat = heartbeat_ms / 1000;
>>> +
>>> +ÂÂÂÂÂÂÂ wsize = readl(wdt->base + RTIWWDSIZECTRL);
>>> +ÂÂÂÂÂÂÂ ret = rti_wdt_setup_hw_hb(wdd, wsize);
>>> +ÂÂÂÂÂÂÂ if (ret) {
>>> +ÂÂÂÂÂÂÂÂÂÂÂ dev_err(dev, "bad window size.\n");

Maybe report what that bad window size actually is ?

>>> +ÂÂÂÂÂÂÂÂÂÂÂ goto err_iomap;
>>> +ÂÂÂÂÂÂÂ }
>>> +
>>> +ÂÂÂÂÂÂÂ last_ping = heartbeat_ms - time_left_ms;
>>> +ÂÂÂÂÂÂÂ if (time_left_ms > heartbeat_ms) {
>>> +ÂÂÂÂÂÂÂÂÂÂÂ dev_warn(dev, "time_left > heartbeat? Assuming last ping just before now.\n");
>>> +ÂÂÂÂÂÂÂÂÂÂÂ last_ping = 0;
>>> +ÂÂÂÂÂÂÂ }
>>
>> All that complexity makes me wonder if it wouldn't be better to just reprogram
>> the watchdog if it is already running. Any reason for not doing that ?
>
> Well, you can re-program it but... It does not take effect until next window starts, so basically we need to take care of the currently running window anyways after which re-programming it doesn't make much sense anymore. And handling the switch from one setup to next would add extra complexity.
>

Seems to me that hardware team really made an effort to make the
watchdog as difficult to program as possible :-(.

> -Tero
>
>>
>>> +ÂÂÂ }
>>> +
>>> +ÂÂÂ watchdog_init_timeout(wdd, heartbeat, dev);
>>> +
>>> ÂÂÂÂÂ ret = watchdog_register_device(wdd);
>>> ÂÂÂÂÂ if (ret) {
>>> ÂÂÂÂÂÂÂÂÂ dev_err(dev, "cannot register watchdog device\n");
>>> ÂÂÂÂÂÂÂÂÂ goto err_iomap;
>>> ÂÂÂÂÂ }
>>> Â +ÂÂÂ if (last_ping)
>>> +ÂÂÂÂÂÂÂ watchdog_set_last_hw_keepalive(wdd, last_ping);
>>> +
>>> ÂÂÂÂÂ return 0;
>>> Â Â err_iomap:
>>>
>>
>
> --
> Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki. Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki