Re: [PATCH] hangcheck-timer is broken on x86

From: Yury Polyanskiy
Date: Fri Mar 26 2010 - 17:52:43 EST


On Fri, 26 Mar 2010 14:24:23 -0700
Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:

> On Tue, 23 Mar 2010 23:36:11 -0400
> Yury Polyanskiy <ypolyans@xxxxxxxxxxxxx> wrote:
>
> > The drivers/char/hangcheck-timer.c is doubly broken. First, the
> > following line overflows unsigned long:
> > # define TIMER_FREQ (HZ*loops_per_jiffy)
> >
> > Second, and more importantly, loops_per_jiffy has little to do with
> > the conversion from the time scale of get_cycles() (aka rdtsc) to the
> > time scale of jiffies.
>
> It's a bit odd to have a driver be this broken on x86_32 for five years
> without anyone noticing. What are the user-visible effects of these
> shortcomings?

When the overflowed value of TIMER_FREQ happens to be abnormally low, the
driver spams the syslog with KERN_CRIT messages "Hangcheck: hangcheck
value past margin!". Whether that happens depends on HZ and lpj in a
complex way. People have hit it occasionally, as far as a Google search
can tell.
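
To make the overflow concrete, here is a minimal userspace sketch, not the
driver code. It assumes a 32-bit unsigned long (modelled with uint32_t),
and the helper names and the HZ/lpj values are made up for illustration,
not taken from a real machine.

```c
#include <stdint.h>

/*
 * Hypothetical helper: models "HZ * loops_per_jiffy" evaluated in a
 * 32-bit unsigned long, as on x86_32. The product wraps modulo 2^32.
 */
static uint32_t timer_freq_32(uint32_t hz, uint32_t lpj)
{
	return hz * lpj;
}

/*
 * Hypothetical helper: the same product carried out in 64 bits, so the
 * result is the full mathematical value of hz * lpj.
 */
static uint64_t timer_freq_64(uint32_t hz, uint32_t lpj)
{
	return (uint64_t)hz * lpj;
}
```

With the assumed values HZ = 1000 and lpj = 5000000, timer_freq_64()
gives 5000000000 while timer_freq_32() gives 705032704, about seven times
too small. With HZ = 250 and lpj = 4000000 the product is 1000000000 and
fits in 32 bits, so nothing goes wrong, which is why the effect depends
on the particular HZ/lpj combination.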

>
> Also, please do send us a Signed-off-by: for this patch, as explained
> in Documentation/SubmittingPatches, thanks.
>

Sorry.

Signed-off-by: Yury Polyanskiy <polyanskiy@xxxxxxxxx>

Thank you Andrew!

Yury

Attachment: signature.asc
Description: PGP signature