Re: NFS client growing system CPU

From: Simon Kirby
Date: Wed Sep 28 2011 - 15:58:43 EST


On Tue, Sep 27, 2011 at 01:04:15PM -0400, Trond Myklebust wrote:

> On Tue, 2011-09-27 at 09:49 -0700, Simon Kirby wrote:
> > On Tue, Sep 27, 2011 at 07:42:53AM -0400, Trond Myklebust wrote:
> >
> > > On Mon, 2011-09-26 at 17:39 -0700, Simon Kirby wrote:
> > > > Hello!
> > > >
> > > > Following up on "System CPU increasing on idle 2.6.36", this issue is
> > > > still happening even on 3.1-rc7. So, since it has been 9 months since I
> > > > reported this, I figured I'd bisect this issue. The first bisection ended
> > > > in an IPMI regression that looked like the problem, so I had to start
> > > > again. Eventually, I got commit b80c3cb628f0ebc241b02e38dd028969fb8026a2
> > > > which made it into 2.6.34-rc4.
> > > >
> > > > With this commit, system CPU keeps rising as the log crunch box runs
> > > > (reads log files via NFS and spews out HTML files into NFS-mounted report
> > > > directories). When it finishes the daily run, the system time stays
> > > > non-zero and continues to be higher and higher after each run, until the
> > > > box never completes a run within a day due to all of the wasted cycles.
> > >
> > > So reverting that commit fixes the problem on 3.1-rc7?
> > >
> > > As far as I can see, doing so should be safe thanks to commit
> > > 5547e8aac6f71505d621a612de2fca0dd988b439 (writeback: Update dirty flags
> > > in two steps) which fixes the original problem at the VFS level.
> >
> > Hmm, I went to git revert b80c3cb628f0ebc241b02e38dd028969fb8026a2, but
> > for some reason git left the nfs_mark_request_dirty(req); line in
> > nfs_writepage_setup(), even though the original commit had that. Is that
> > OK or should I remove that as well?
> >
> > Once that is sorted, I'll build it and let it run for a day and let you
> > know. Thanks!
>
> It shouldn't make any difference whether you leave it or remove it. The
> resulting second call to __set_page_dirty_nobuffers() will always be a
> no-op since the page will already be marked as dirty.

OK, confirmed: git revert b80c3cb628f0ebc241b02e38dd028969fb8026a2 on
3.1-rc7 fixes the problem for me. Does this make sense, then, or do we
need further investigation and/or testing?
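
For reference, the no-op argument quoted above (a second call to
__set_page_dirty_nobuffers() does nothing once the page is already dirty)
can be illustrated with a small userspace model. This is only a sketch,
not kernel code: struct model_page, model_test_set_dirty() and
model_set_page_dirty() are hypothetical names, and the model assumes the
dirty path is guarded by a test-and-set on the dirty flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a page; NOT the kernel's struct page. */
struct model_page {
    bool dirty;          /* models the page's dirty flag */
    unsigned accounted;  /* how many times dirty accounting ran */
};

/* Test-and-set: mark the page dirty and return the previous flag value. */
bool model_test_set_dirty(struct model_page *p)
{
    bool was_dirty = p->dirty;
    p->dirty = true;
    return was_dirty;
}

/*
 * Assumed shape of the dirty path: only the caller that flips the flag
 * from clean to dirty does the accounting work.
 * Returns 1 if the page was newly dirtied, 0 if the call was a no-op.
 */
int model_set_page_dirty(struct model_page *p)
{
    assert(p);
    if (model_test_set_dirty(p))
        return 0;        /* already dirty: nothing more to do */
    p->accounted++;      /* first dirtying: account for it once */
    return 1;
}
```

In this model, calling model_set_page_dirty() twice on a clean page
returns 1 and then 0, and the accounting counter stays at 1, which is why
leaving or removing the extra nfs_mark_request_dirty(req) call should not
change behaviour.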

Simon-