Re: [PATCH v3 4/6] watchdog: add watchdog pretimeout framework

From: Vladimir Zapolskiy
Date: Wed Jun 08 2016 - 09:38:04 EST


Hi Guenter,

On 08.06.2016 00:43, Guenter Roeck wrote:
> On Tue, Jun 07, 2016 at 08:38:45PM +0300, Vladimir Zapolskiy wrote:
>> The change adds a simple watchdog pretimeout framework infrastructure,
>> its purpose is to allow users to select a desired handling of watchdog
>> pretimeout events, which may be generated by some watchdog devices.
>>
>> A user selects a default watchdog pretimeout governor during
>> compilation stage.
>>
>> Watchdogs with WDIOF_PRETIMEOUT capability now have two device
>> attributes in sysfs: pretimeout to display currently set pretimeout
>> value and pretimeout_governor attribute to display the selected
>> watchdog pretimeout governor.
>>
>> Watchdogs with no WDIOF_PRETIMEOUT capability has no changes in
>> sysfs, and such watchdog devices do not require the framework.
>>
>
> One might read "require" as saying that it is mandatory for drivers
> with pretimeout support. That is not what you mean, presumably ?
>

Well, I expressed a negation, formally it does not contradict to the
fact that a WDIOF_PRETIMEOUT capable device/driver can exist without
this code.

If a board has watchdog devices/drivers without WDIOF_PRETIMEOUT
capability (at the moment only kempld_wdt.c explicitly sets it)
the whole framework is dead code, though "do not require" may be too
strict.

But I agree, there is a room for ambiguity, I'll reword the paragraph.

>> Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@xxxxxxxxxx>
>> ---
>> Changes from v2 to v3:
>> * essentially simplified the implementation due to removal of runtime
>> dynamic selection of watchdog pretimeout governors by a user, this
>> feature is supposed to be added later on
>> * removed support of sleepable watchdog pretimeout governors
>> * moved sysfs device attributes to watchdog_dev.c, this required to
>> add exported watchdog_pretimeout_governor_name() interface
>> * if pretimeout framework is not selected, then pretimeout event
>> ends up in kernel panic -- this behaviour as a default one was asked
>> by Guenter
>>
>> Changes from v1 to v2:
>> * removed framework private bits from struct watchdog_governor,
>> * centralized compile-time selection of a default governor in
>> watchdog_pretimeout.h,
>> * added can_sleep option, now only sleeping governors (e.g. userspace)
>> will be executed in a special workqueue,
>> * changed fallback logic, if a governor in use is removed, now this
>> situation is not possible, because in use governors have non-zero
>> module refcount
>>
>> drivers/watchdog/Kconfig | 8 ++
>> drivers/watchdog/Makefile | 5 +-
>> drivers/watchdog/watchdog_core.c | 9 +++
>> drivers/watchdog/watchdog_dev.c | 16 ++++
>> drivers/watchdog/watchdog_pretimeout.c | 134 +++++++++++++++++++++++++++++++++
>> drivers/watchdog/watchdog_pretimeout.h | 42 +++++++++++
>> include/linux/watchdog.h | 13 ++++
>> 7 files changed, 226 insertions(+), 1 deletion(-)
>> create mode 100644 drivers/watchdog/watchdog_pretimeout.c
>> create mode 100644 drivers/watchdog/watchdog_pretimeout.h
>>
> Please document the new API functions.
>

Ok.

>> diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig
>> index b54f26c..354217e 100644
>> --- a/drivers/watchdog/Kconfig
>> +++ b/drivers/watchdog/Kconfig
>> @@ -1800,4 +1800,12 @@ config USBPCWATCHDOG
>>
>> Most people will say N.
>>
>> +comment "Watchdog Pretimeout Governors"
>> +
>> +config WATCHDOG_PRETIMEOUT_GOV
>> + bool "Enable watchdog pretimeout governors"
>> + default n
>
> I don't think 'default n" is needed.
>

No strict objections, but probably 'default n' may save quite many
lines in defconfigs.

>> + help
>> + The option allows to select watchdog pretimeout governors.
>> +
>> endif # WATCHDOG
>> diff --git a/drivers/watchdog/Makefile b/drivers/watchdog/Makefile
>> index a46e7c1..cca47de 100644
>> --- a/drivers/watchdog/Makefile
>> +++ b/drivers/watchdog/Makefile
>> @@ -3,9 +3,12 @@
>> #
>>
>> # The WatchDog Timer Driver Core.
>> -watchdog-objs += watchdog_core.o watchdog_dev.o
>> obj-$(CONFIG_WATCHDOG_CORE) += watchdog.o
>>
>> +watchdog-objs += watchdog_core.o watchdog_dev.o
>> +
>> +watchdog-$(CONFIG_WATCHDOG_PRETIMEOUT_GOV) += watchdog_pretimeout.o
>> +
>> # Only one watchdog can succeed. We probe the ISA/PCI/USB based
>> # watchdog-cards first, then the architecture specific watchdog
>> # drivers and then the architecture independent "softdog" driver.
>> diff --git a/drivers/watchdog/watchdog_core.c b/drivers/watchdog/watchdog_core.c
>> index 7c3ba58..ae6c23a 100644
>> --- a/drivers/watchdog/watchdog_core.c
>> +++ b/drivers/watchdog/watchdog_core.c
>> @@ -40,6 +40,7 @@
>> #include <linux/of.h> /* For of_get_timeout_sec */
>>
>> #include "watchdog_core.h" /* For watchdog_dev_register/... */
>> +#include "watchdog_pretimeout.h"
>>
>> static DEFINE_IDA(watchdog_ida);
>>
>> @@ -244,6 +245,13 @@ static int __watchdog_register_device(struct watchdog_device *wdd)
>> }
>> }
>>
>> + ret = watchdog_register_pretimeout(wdd);
>> + if (ret) {
>> + watchdog_dev_unregister(wdd);
>> + ida_simple_remove(&watchdog_ida, wdd->id);
>> + return ret;
>> + }
>> +
>> if (test_bit(WDOG_STOP_ON_REBOOT, &wdd->status)) {
>> wdd->reboot_nb.notifier_call = watchdog_reboot_notifier;
>>
>> @@ -251,6 +259,7 @@ static int __watchdog_register_device(struct watchdog_device *wdd)
>> if (ret) {
>> pr_err("watchdog%d: Cannot register reboot notifier (%d)\n",
>> wdd->id, ret);
>> + watchdog_unregister_pretimeout(wdd);
>> watchdog_dev_unregister(wdd);
>> ida_simple_remove(&watchdog_ida, wdd->id);
>> return ret;
>
> A bit at loss here. Why is it not necessary to unregister the pretimeout
> from __watchdog_unregister_device() ? Doesn't this result in a stale pointer
> and a memory leak ?

This is a bug you found, thank you.

>> diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
>> index 87bbae7..c0fd743 100644
>> --- a/drivers/watchdog/watchdog_dev.c
>> +++ b/drivers/watchdog/watchdog_dev.c
>> @@ -49,6 +49,7 @@
>> #include <linux/uaccess.h> /* For copy_to_user/put_user/... */
>>
>> #include "watchdog_core.h"
>> +#include "watchdog_pretimeout.h"
>>
>> /*
>> * struct watchdog_core_data - watchdog core internal data
>> @@ -452,6 +453,16 @@ static ssize_t pretimeout_show(struct device *dev,
>> }
>> static DEVICE_ATTR_RO(pretimeout);
>>
>> +static ssize_t pretimeout_governor_show(struct device *dev,
>> + struct device_attribute *devattr,
>> + char *buf)
>> +{
>> + struct watchdog_device *wdd = dev_get_drvdata(dev);
>> +
>> + return watchdog_pretimeout_governor_name(wdd, buf);
>> +}
>> +static DEVICE_ATTR_RO(pretimeout_governor);
>> +
>> static umode_t wdt_is_visible(struct kobject *kobj, struct attribute *attr,
>> int n)
>> {
>> @@ -466,6 +477,10 @@ static umode_t wdt_is_visible(struct kobject *kobj, struct attribute *attr,
>> else if (attr == &dev_attr_pretimeout.attr &&
>> !(wdd->info->options & WDIOF_PRETIMEOUT))
>> mode = 0;
>> + else if (attr == &dev_attr_pretimeout_governor.attr &&
>> + !((wdd->info->options & WDIOF_PRETIMEOUT) &&
>> + IS_ENABLED(CONFIG_WATCHDOG_PRETIMEOUT_GOV)))
>
> This is confusing. How about the following ?
> &&
> (!(wdd->info->options & WDIOF_PRETIMEOUT) ||
> !IS_ENABLED(CONFIG_WATCHDOG_PRETIMEOUT_GOV)))
>
> Sure, it is mathematically the same, but it would be much easier
> to understand.

No objections, originally I intended to preserve the preexisting style of

attr == &dev_attr_XXX.attr && !(some expression on wdd)

>> + mode = 0;
>>
>> return mode;
>> }
>> @@ -478,6 +493,7 @@ static struct attribute *wdt_attrs[] = {
>> &dev_attr_status.attr,
>> &dev_attr_nowayout.attr,
>> &dev_attr_pretimeout.attr,
>> + &dev_attr_pretimeout_governor.attr,
>> NULL,
>> };
>>
>> diff --git a/drivers/watchdog/watchdog_pretimeout.c b/drivers/watchdog/watchdog_pretimeout.c
>> new file mode 100644
>> index 0000000..ebfc3d6
>> --- /dev/null
>> +++ b/drivers/watchdog/watchdog_pretimeout.c
>> @@ -0,0 +1,134 @@
>> +/*
>> + * Copyright (C) 2015-2016 Mentor Graphics
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; either version 2 of the License, or
>> + * (at your option) any later version.
>> + *
>> + */
>> +
>> +#include <linux/list.h>
>> +#include <linux/slab.h>
>> +#include <linux/spinlock.h>
>> +#include <linux/watchdog.h>
>> +
>> +#include "watchdog_pretimeout.h"
>> +
>> +/* Default watchdog pretimeout governor */
>> +static struct watchdog_governor *default_gov;
>> +
>> +/* The spinlock protects wdd->gov and pretimeout_list */
>> +static DEFINE_SPINLOCK(pretimeout_lock);
>> +
>> +/* List of watchdog devices, which can generate a pretimeout event */
>> +static LIST_HEAD(pretimeout_list);
>> +
>> +struct watchdog_pretimeout {
>> + struct watchdog_device *wdd;
>> + struct list_head entry;
>> +};
>> +
>> +int watchdog_pretimeout_governor_name(struct watchdog_device *wdd, char *buf)
>> +{
>> + int count = 0;
>> +
> Initialization seems unnecessary.
>

Correct, will remove it.

>> + spin_lock_irq(&pretimeout_lock);
>> + if (wdd->gov)
>> + count = sprintf(buf, "%s\n", wdd->gov->name);
>> + else
>> + count = sprintf(buf, "N/A\n");
>
> Why not just return an empty string ?
>

This is a matter of interface, whatever clearer for a user is
preferred. I don't have objections against returning an empty string,
I will change it in v4.

>> + spin_unlock_irq(&pretimeout_lock);
>> +
>> + return count;
>> +}
>> +
>> +void watchdog_notify_pretimeout(struct watchdog_device *wdd)
>> +{
>> + unsigned long flags;
>> +
>> + if (!wdd)
>> + return;
>> +
> Why would this function ever be called with NULL wdd ?
>

No reason for that, I believe Postel's law is not applicable here...

>> + spin_lock_irqsave(&pretimeout_lock, flags);
>> + if (!wdd->gov) {
>> + spin_unlock_irqrestore(&pretimeout_lock, flags);
>> + return;
>> + }
>> +
>> + wdd->gov->pretimeout(wdd);
>> + spin_unlock_irqrestore(&pretimeout_lock, flags);
>> +}
>> +EXPORT_SYMBOL_GPL(watchdog_notify_pretimeout);
>> +
>> +int watchdog_register_governor(struct watchdog_governor *gov)
>> +{
>> + struct watchdog_pretimeout *p;
>> +
>> + if (!gov || !gov->name || !gov->pretimeout ||
>> + strlen(gov->name) >= WATCHDOG_GOV_NAME_MAXLEN)
>> + return -EINVAL;
>
> We don't usually do API parameter validation like this. Callers
> are expected to write correct code.

Same as above, I will remove the checks.

>> +
>> + if (default_gov)
>> + return -EBUSY;
>> +
>
> What does this mean ? That only one governor will be accepted ?
>
> Maybe an API documentation will help, but I don't really understand the logic
> here. If only one governor can be registered anyway, why bother keeping more
> than one around ?

Same as above.

>> + spin_lock_irq(&pretimeout_lock);
>> + list_for_each_entry(p, &pretimeout_list, entry)
>> + if (!p->wdd->gov)
>> + p->wdd->gov = gov;
>> + spin_unlock_irq(&pretimeout_lock);
>> +
>> + default_gov = gov;
>> +
>> + return 0;
>> +}
>> +
>> +void watchdog_unregister_governor(struct watchdog_governor *gov)
>> +{
>> + if (!gov)
>> + return;
>> +
> Under what circumstances is this expected to happen ?
>

Same as above.

>> + if (default_gov == gov)
>> + default_gov = NULL;
>
> Why bother ? Registration code would not accept a second registration anyway.
>
> Individual drivers may (and will) still reference default_gov.
>
> With the governors all being boolean, the remove function is pretty much
> useless anyway.
>
> I understand that you plan to add features later, and maybe there is a plan
> to correctly handle more than one governor. But it doesn't make sense to start
> with a broken framework; whatever is introduced should work by itself and not
> rely on it being fixed later.

Yes, that's perfectly correct, I totally agree.

>> +}
>> +
>> +int watchdog_register_pretimeout(struct watchdog_device *wdd)
>> +{
>> + struct watchdog_pretimeout *p;
>> +
>> + if (!(wdd->info->options & WDIOF_PRETIMEOUT))
>> + return 0;
>> +
>> + p = kzalloc(sizeof(*p), GFP_KERNEL);
>> + if (!p)
>> + return -ENOMEM;
>> +
>> + spin_lock_irq(&pretimeout_lock);
>> + list_add(&p->entry, &pretimeout_list);
>> + p->wdd = wdd;
>> + wdd->gov = default_gov;
>
> Unless I am missing something, at least per this patch, there is only
> one governor that is ever accepted, which is the first one registered.
> With that, I don't really see the value of keeping a per-device governor
> list around.

This is not a per-device governor list (which actually was present in v2
and removed in v3), the description of this list is found around its
declaration, this is a list of watchdog devices, which can produce
a pretimeout event. Oh, "pretimeout_list" is a confusing naming...

The list can not be safely removed due to the fact that there is no
synchronization between registration of governors and watchdog devices,
at boot time a governor driver may be registered after completed
registration of some watchdog drivers, still these early watchdogs should
get a link to the governor, when it is ready, the list intends to
accumulate all of these registered WDIOF_PRETIMEOUT capable watchdogs.

> After all my comments, I must admit that I am quite at loss, maybe due to
> the lack of documentation. If so, please take my feedback as talking
> points to add to the documentation, to help others understand how this
> framework is supposed to work.
>

Sure, thank you so much for review.

--
Best wishes,
Vladimir