RE: [PATCH V2] kernel/watchdog: fix spurious hard lockups

From: Thomas Gleixner
Date: Mon Jul 17 2017 - 03:14:40 EST


On Mon, 17 Jul 2017, Liang, Kan wrote:
> There are three proposed patches so far.
> Patch 1: The patch above, which speeds up the hrtimer.
> Patch 2: Thomas's first proposal.
> https://patchwork.kernel.org/patch/9803033/
> https://patchwork.kernel.org/patch/9805903/
> Patch 3: My original proposal, which increases the NMI watchdog timeout by 3x.
> https://patchwork.kernel.org/patch/9802053/
>
> According to our tests, only patch 3 works well.
> The other two patches eventually hang the system.
> With patch 1, the system hung after running our test case for ~1 hour.
> With patch 2, the system hung during the overnight test.
> No error message was shown when the system hung, so I don't know the
> root cause yet.

That doesn't make sense. What's the exact test procedure?

> BTW: We set watchdog_thresh to 1 when we ran the test.
> It's believed that this makes the failure show up sooner.

Belief is not really a technical measure....

Thanks,

tglx