Re: Faulty commit "watchdog: iTCO_wdt: Account for rebooting on second timeout"

From: Jan Kiszka
Date: Tue Aug 03 2021 - 11:01:28 EST


On 03.08.21 16:59, Jan Kiszka wrote:
> On 03.08.21 16:51, Jean Delvare wrote:
>> Hi all,
>>
>> Commit cb011044e34c ("watchdog: iTCO_wdt: Account for rebooting on
>> second timeout") causes a regression on several systems. Symptoms are:
>> system reboots automatically after a short period of time if watchdog
>> is enabled (by systemd for example). This has been reported in bugzilla:
>>
>> https://bugzilla.kernel.org/show_bug.cgi?id=213809
>>
>> Unfortunately this commit was backported to all stable kernel branches
>> (4.14, 4.19, 5.4, 5.10, 5.12 and 5.13). I'm not sure why that is the
>> case, BTW, as there is no Fixes tag and no Cc to stable@vger either.
>> And the fix is not trivial, has apparently not seen enough testing,
>> and addresses a problem that has a known and simple workaround. IMHO it
>> should never have been accepted as a stable patch in the first place.
>> Especially when the previous attempt to fix this issue already ended
>> with a regression and a revert.
>>
>> Anyway... After a glance at the patch, I see what looks like a nice
>> thinko:
>>
>> +	if (p->smi_res &&
>> +	    (SMI_EN(p) & (TCO_EN | GBL_SMI_EN)) != (TCO_EN | GBL_SMI_EN))
>>
>> The author most certainly meant inl(SMI_EN(p)) (the register's value)
>> and not SMI_EN(p) (the register's address).
>>
>
> https://lkml.org/lkml/2021/7/26/349
>

That's for the fix (in line with your analysis).
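
To illustrate the address-versus-value mix-up from the analysis quoted above, here is a minimal user-space sketch. It is not the driver code: fake_inl(), FAKE_SMI_EN_ADDR and fake_smi_en_value are hypothetical stand-ins for inl(), SMI_EN(p) and the hardware register, and the bit positions of TCO_EN and GBL_SMI_EN are assumed here for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define TCO_EN      (1u << 13)
#define GBL_SMI_EN  (1u << 0)

/* Hypothetical port address standing in for SMI_EN(p). */
#define FAKE_SMI_EN_ADDR 0x1830u

/* Hypothetical register contents standing in for the hardware. */
static uint32_t fake_smi_en_value;

/* Hypothetical replacement for inl(): returns the register's value. */
static uint32_t fake_inl(uint32_t addr)
{
	assert(addr == FAKE_SMI_EN_ADDR);
	return fake_smi_en_value;
}

/* Thinko: tests bits of the register's ADDRESS, not its contents. */
static int smi_not_enabled_buggy(void)
{
	return (FAKE_SMI_EN_ADDR & (TCO_EN | GBL_SMI_EN)) !=
	       (TCO_EN | GBL_SMI_EN);
}

/* Intended check: reads the register first, then tests its bits. */
static int smi_not_enabled_fixed(void)
{
	return (fake_inl(FAKE_SMI_EN_ADDR) & (TCO_EN | GBL_SMI_EN)) !=
	       (TCO_EN | GBL_SMI_EN);
}
```

With both bits set in fake_smi_en_value, the fixed variant returns 0 while the buggy one still returns 1, because the result of the buggy test depends only on the address constant and never on what the register holds.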

I was also wondering whether backporting that quickly was needed. I
didn't propose it, though.

Jan

--
Siemens AG, T RDA IOT
Corporate Competence Center Embedded Linux