Re: [patch 5/5] clocksource: Rewrite watchdog code completely

From: Thomas Gleixner

Date: Sun Mar 08 2026 - 05:53:46 EST


Daniel!

On Mon, Feb 23 2026 at 14:53, Thomas Gleixner wrote:
> On Sun, Feb 15 2026 at 20:18, Daniel J Blueman wrote:
>> On Mon, 2 Feb 2026 at 19:27, Thomas Gleixner <tglx@xxxxxxxxxx> wrote:
>> Good step forward! We can also reduce remote cacheline invalidation by
>> putting 'seq' into the cacheline after 'cpu_ts' by reordering:
>
> Good point.
>
>> With that said, with your latest change on the 1920-thread setup,
>> WATCHDOG_READOUT_MAX_US 1000 is still needed to avoid timeouts during
>> the previous adverse workload; however, some timeouts are still seen
>> during massive parallel process teardowns.
>>
>> To limit overhead, perhaps it is sufficient to set the timeout to
>> 100us, avoid retries (as the hardware thread may continue to be busy
>> and will be rechecked later anyway), and log timeouts at the debug
>> level if at all.
>
> Something like the below should work even with 50us. I left the print at
> INFO level for now. We can either change it to pr_info_once() or to
> debug as you said.

Any chance you can give this a test ride on that 1920-thread
monstrosity?

Thanks,

tglx