Hi Timo,
On 22 May 2015 at 16:59, Timo Kokkonen <timo.kokkonen@xxxxxxxxxx> wrote:
On 22.05.2015 11:23, Fu Wei wrote:
Hi Timo,
On 22 May 2015 at 14:30, Timo Kokkonen <timo.kokkonen@xxxxxxxxxx> wrote:
On 21.05.2015 11:32, fu.wei@xxxxxxxxxx wrote:
From: Fu Wei <fu.wei@xxxxxxxxxx>
Also update Documentation/watchdog/watchdog-kernel-api.txt to
introduce:
(1)the new elements in the watchdog_device and watchdog_ops struct;
(2)the new API "watchdog_init_timeouts".
Reasons:
(1)kernel already has two watchdog drivers are using "pretimeout":
drivers/char/ipmi/ipmi_watchdog.c
drivers/watchdog/kempld_wdt.c(but the definition is different)
(2)some other dirvers are going to use this: ARM SBSA Generic Watchdog
Hi,
As I was proposing some other API changes with my early-timeout-sec work,
I
can see my work is going to collide with your API change proposal a bit.
So
maybe I should ask your opinion as well..
Thank you for reminding me of that, I saw "early-timeout-sec", but
because I don't get it in kernel API or watchdog_core.c, so I did not
pay attention to it.
Sorry.
Could this pretimeout feature be something that other drivers could
benefit
too?
yes , as you may know, I am making a patch for pretimeout support in
watchdog framework
I can see that it does not do anything else except call panic() before
letting the watchdog expire. This is something that could be emulated
easily
by the watchdog core for drivers that don't know anything about
pretimeouts
at all.
For SBSA watchdog, there are two stage timeout, and according to kernel
doc:
----------------------
Pretimeouts:
Some watchdog timers can be set to have a trigger go off before the
actual time they will reset the system. This can be done with an NMI,
interrupt, or other mechanism. This allows Linux to record useful
information (like panic information and kernel coredumps) before it
resets.
----------------------
I think panic() is the right way to do now. but people may also need
to config :
PANIC_TIMEOUT [!=0]
KEXEC
KDUMP
for debug reason
Yes, so basically if we hit pretimeout, we probably have already crashed.
yes, for now , I only use panic(), but at the beginning, I offer
user some options:
https://lists.linaro.org/pipermail/linaro-acpi/2015-April/004350.html
The only difference is that we now have some seconds time to dump out useful
data and then either reboot or let the actual watchdog reset take care of
resetting the device. panic() sounds like a good thing to do. Maybe you
could also dump registers on other CPUs too or try to get out some other
useful information about the kernel in case it has crashed or something. But
I'm just guessing.
yes, that is my thought.
because there is the kdump support in panic(), so that can give use a
chance to figure out why the watchdog wasn't fed.
The way I was planning the API change there would need to be a small
change
with each watchdog driver in order to let the watchdog core take over
generic behaviour on behalf of the driver. My goal was to make the change
so
that each driver that gets converted to the new API extensions gets a
support for early-timeout-sec for free, without needing to enable support
for it any way. If the API was designed properly, also pretimeouts could
be
handled easily and maybe even so that other drivers could have that
feature
even though their hardware does not explicitly give any support for it.
could you please point out your patch , then I can learn your idea :-)
For now , I am working on a common "Pretimeouts" following the concept
in kernel doc.
Any thoughts?
my thoughts is in my pretimeout patch , would you provide some suggestion
?
Here is an archive link to my patch set:
http://www.spinics.net/lists/linux-watchdog/msg06473.html
Ah , cool, I can see some common in your patch. Thanks :-)
See if I can learn something from your patches
Your patch set is adding a new call to the core for adjusting the
pretimoeut, which is probably a good thing in case the driver needs special
handling for it anyway. But if the core had capability to emulate pretimeout
for drivers that can't support it in hardware, it would be good if there was
a way for the core to support it even though the driver had zero code for
it. The core also has watchdog_init_timeout() right now but even that is not
called from that many drivers. I would like to fix that too so that drivers
would not need to bother so much about it but let core take care of it more.
This is why I proposed the watchdog_init_params() call that could dig all
the generic parameters from device tree on behalf of the driver. This is
where I though timeout and early-timeout-sec parameters would be parsed from
device tree, but it could also parse pretimeout parameter as well.
Apparently Guenter did not like my approach so we might need some other way
to do it.
I am using pretimeout, because SBSA watchdog hardware support two
stages timeout,
I am reusing the existing kernel concept, but your early_timeout_sec
is a new concept.
If you check my previous patchset , you will see : at the beginning,
pretimeout support is not a generic features in my patchset.
Then Arnd suggest that I can try to make pretimeout into watchdog
framework, and Guenter said :"Makes sense."
So I am still trying to improve pretimeout support :-)
If I can make pretimeout merged, may be you can try pretimeout to
implement early_timeout_sec function?
It is up to the maintainers, I will try my best.