Re: High CPU load when machine is idle (related to PROBLEM: Unusually high load average when idle in 2.6.35, 2.6.35.1 and later)
From: tmhikaru
Date: Thu Oct 21 2010 - 21:37:42 EST
On Thu, Oct 21, 2010 at 02:36:21PM -0400, tm@ wrote:
> On Wed, Oct 20, 2010 at 09:48:43PM -0400, tm@ wrote:
> > On Wed, Oct 20, 2010 at 07:26:45PM +0200, Peter Zijlstra wrote:
> > >
> > >
> > > OK, how does this work for people? I find my idle load is still a tad
> > > high, but maybe I'm not patient enough.
> >
> > I haven't had a chance to keep up with the topic, and I apologize. I'll be
> > testing this as soon as I can finish compiling it. Thank you all for not
> > letting this go unfixed.
> >
> > Tim McGrath
>
> Now that I've actually had a chance to boot the kernel with the patch
> applied I'm sorry to say but the load average isn't decaying as fast as it
> ought to, at the very least. My machine's been idle for the last ten minutes
> but the one minute average is still at 0.89 and shooting up to 1.5, the 5
> min average is 0.9, and the 15 min average is .68 and climbing. Even as I'm
> writing this the averages are continuing to drop, but *very* slowly.
> Glacially, almost. The one minute average is continuing to randomly spike
> high for no reason I can tell as well.
>
> I'll let you guys know if this actually bottoms out at some point.
>
> Tim McGrath

It did not. When I came home and checked, the load average was a steady
0.7-0.8 across all three averages, even though the machine had been idle for
the previous six hours. I guess the patch didn't fix the problem for me. If
you want, I'll try building master/tip with the patch applied, but I doubt
it'll really be any different.
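
For a sense of how fast the averages ought to fall on a truly idle machine,
here is a rough sketch of the exponential decay the kernel applies at each
sample. The constants are the kernel's fixed-point load average values as
far as I know them; the function names and the assumption of exactly zero
active tasks at every 5 second sample are mine, for illustration only.

```python
# Fixed-point constants used by the kernel's load average code.
FSHIFT = 11
FIXED_1 = 1 << FSHIFT                    # 1.0 in fixed point (2048)
EXP_1, EXP_5, EXP_15 = 1884, 2014, 2037  # per-sample decay factors
SAMPLE_SECONDS = 5                       # the load is sampled about every 5 s


def calc_load(load, exp, active):
    """One sampling step: decay the old value and mix in the active count."""
    load = load * exp + active * (FIXED_1 - exp)
    return load >> FSHIFT


def idle_decay(start, exp, minutes):
    """Hypothetical helper: decay `start` with zero active tasks throughout."""
    load = int(start * FIXED_1)
    for _ in range(minutes * 60 // SAMPLE_SECONDS):
        load = calc_load(load, exp, 0)   # active = 0, i.e. fully idle
    return load / FIXED_1


if __name__ == "__main__":
    print("1 min avg after 10 idle minutes:", idle_decay(0.89, EXP_1, 10))
    print("5 min avg after 10 idle minutes:", idle_decay(0.90, EXP_5, 10))
    print("15 min avg after 6 idle hours:  ", idle_decay(0.80, EXP_15, 360))
```

Under those assumptions the one minute figure should be at zero well inside
ten minutes, and all three should be at zero long before six hours are up,
so a steady 0.7-0.8 suggests the sampled active count is not actually zero.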

On the plus side, the patch did do something - the load average seems much
less erratic than it used to be, for whatever reason, and now holds very
steady rather than jumping about as it does without the patch applied.

I wish I understood the code well enough to know what is going wrong here. I
have to wonder what impact the original bug was having. It seems to me that
if it only affected a few people, it might be worth backing out the patch
and rethinking the problem it was meant to fix. On the other hand, if
there's some way of diagnosing the problem, I'm all for it - are there some
printk() calls or something I could put in the code to find out when it's
doing unlikely or 'impossible' things? There is serious weirdness going on
here and I'd like to figure out the cause of it. I get the impression we're
all bumbling about in the dark, poking at a gigantic elephant and getting
the wrong impressions.

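
Short of instrumenting the kernel, one userspace check is to log the reported
averages next to a count of tasks that are actually runnable (R) or in
uninterruptible sleep (D), since those are the states the load average is
meant to count. The following is only a sketch: the helper names are made up
for illustration, it assumes the usual /proc/loadavg and
/proc/[pid]/task/[tid]/stat layouts, and it can miss short-lived tasks.

```python
import glob


def parse_loadavg(text):
    """Split a /proc/loadavg line into averages and the runnable/total pair."""
    fields = text.split()
    runnable, total = fields[3].split("/")
    return {
        "avg1": float(fields[0]),
        "avg5": float(fields[1]),
        "avg15": float(fields[2]),
        "runnable": int(runnable),
        "total": int(total),
    }


def task_state(stat_text):
    """Return the state letter from a stat line.

    The command name is parenthesised and may contain spaces, so the state
    is taken as the first field after the last ')'.
    """
    return stat_text.rsplit(")", 1)[1].split()[0]


def count_states(pattern="/proc/[0-9]*/task/[0-9]*/stat"):
    """Count tasks per state letter; tasks that exit mid-scan are skipped."""
    counts = {}
    for path in glob.glob(pattern):
        try:
            with open(path) as fh:
                state = task_state(fh.read())
        except (OSError, IndexError):
            continue
        counts[state] = counts.get(state, 0) + 1
    return counts


def snapshot():
    """One sample: reported averages plus the observed R and D task counts."""
    with open("/proc/loadavg") as fh:
        sample = parse_loadavg(fh.read())
    states = count_states()
    sample["R"] = states.get("R", 0)
    sample["D"] = states.get("D", 0)
    return sample
```

Calling snapshot() every few seconds on the idle machine and printing the
result would show whether R and D tasks are really present. If they sit at or
near zero (apart from the sampling script itself) while the averages stay at
0.7-0.8, that would point at the accounting rather than at real work.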
Tim McGrath