Re: [PATCH 1/2] dt-bindings: watchdog: intel: Add YAML Schemas for Watchdog timer

From: Guenter Roeck
Date: Thu Jun 11 2020 - 13:09:54 EST


On Thu, Jun 11, 2020 at 05:38:14PM +0800, Dilip Kota wrote:
>
> On 6/10/2020 9:05 PM, Guenter Roeck wrote:
> > On 6/10/20 12:54 AM, Dilip Kota wrote:
> > > On 6/9/2020 9:46 PM, Guenter Roeck wrote:
> > > > On 6/9/20 1:57 AM, Dilip Kota wrote:
> > > > > On 6/8/2020 9:37 PM, Guenter Roeck wrote:
> > > > > > On 6/7/20 10:49 PM, Dilip Kota wrote:
> [...]
> > > > > > > +
> > > > > > > +description: |
> > > > > > > +  Intel Lightning Mountain SoC has General Purpose Timer Counter(GPTC) which can
> > > > > > > +  be configured as Clocksource, real time clock and Watchdog timer.
> > > > > > > +  Each General Purpose Timer Counter has three timers. And total four General
> > > > > > > +  Purpose Timer Counters are present on Lightning Mountain SoC which sums up
> > > > > > > +  to 12 timers.
> > > > > > > +  Lightning Mountain has four CPUs and each CPU is configured with one GPTC
> > > > > > > +  timer as watchdog timer. Total four timers are configured as watchdog timers
> > > > > > > +  on Lightning Mountain SoC.
> > > > > > > +
> > > > > > Why not just one ? The watchdog subsystem does not monitor individual CPUs,
> > > > > > it monitors the system.
> > > > > The Intel Atom based Lightning Mountain SoC has four CPUs. On
> > > > > Lightning Mountain SoC, the watchdog subsystem is a combination of
> > > > > the GPTC timers and the reset controller unit. Each CPU is configured
> > > > > with one GPTC timer, so that if any CPU hangs or freezes, the watchdog
> > > > > daemon running on that CPU cannot ping (pet) the watchdog timer. This
> > > > > causes a watchdog timeout. On watchdog timeout, the reset controller
> > > > > triggers a reset of the respective CPU.
> > > > >
> > > > A system watchdog driver should not duplicate functionality
> > > > from kernel/watchdog.c, which monitors individual CPUs.
> > > > If the SoC does not provide a system watchdog timer (which
> > > > I think is unlikely), it should stick with that. A watchdog
> > > > resetting an individual CPU instead of the entire system
> > > > isn't something I would want to see in the watchdog subsystem.
> > > My bad here: a complete hardware reset happens on watchdog timeout, not a reset of a single CPU or core.
> > > Could you please clarify: by "the complete system", do you mean "a watchdog subsystem should monitor all the cores/CPUs in the SoC, rather than each core/CPU in the SoC having its own wdt"?
> > >
> > No, the watchdog subsystem does not monitor "all cores".
> > Again, that is the responsibility of kernel/watchdog.c.
> I am a bit confused here.
> I have gone through the kernel/watchdog.c code and I see that hrtimers are
> used and a panic is triggered on a CPU/core lockup.
> It looks similar to the watchdog subsystem, which uses a wdt and triggers a
> hardware reset on timeout, whereas kernel/watchdog.c uses hrtimers and
> triggers a panic on timeout.
> To my understanding, a watchdog timer recovers the hardware from software
> hangs or freeze states on the CPUs/cores.
> Also, what does "system" mean in your statement "watchdog subsystem monitors
> the system"? What does the system include other than the cores/CPUs?
> I also see that there is no other watchdog subsystem in the Lightning
> Mountain architecture.
>

From my perspective, we are not going to duplicate functionality covered
by kernel/watchdog.c, which means we are not going to support per cpu core
watchdog drivers in drivers/watchdog.

If you insist on doing it anyway, please discuss it with Wim.

Thanks,
Guenter