Re: pdflush eating a lot of CPU on heavy NFS I/O

From: Brent Cook
Date: Thu Apr 29 2004 - 08:17:41 EST


On Wed, 28 Apr 2004, Andrew Morton wrote:

> Brent Cook <busterbcook@xxxxxxxxx> wrote:
> >
> > On Wed, 28 Apr 2004, Andrew Morton wrote:
> >
> > > Brent Cook <busterbcook@xxxxxxxxx> wrote:
> > > >
> > > > sync_sb_inodes: write inode c55d25bc
> > > > __sync_single_inode: writepages in nr_pages:25 nr_to_write:949
> > > > pages_skipped:0 en:0
> > > > __sync_single_inode: writepages in nr_pages:25 nr_to_write:949
> > > > pages_skipped:0 en:0
> > >
> > > uh-huh.
> > >
> > > Does this fix it?
> >
> > I'm going to run a compile/load test overnight, but the test that
> > triggered it every time previously failed to do so with this patch.
>
> OK, thanks. A better patch would be:

No, thank you! The overnight test was successful. I have been running this
better patch for a little while, and it is no worse. I think you have
solved the bigger problem, which was the runaway process, at least for me.

So, moving it to the tail of the s_dirty list now gives that inode a
higher priority for writeback next time? That sounds better than just
redirtying it; the poor inode has been through enough as it is without
having to wait even longer.

If you want to think about it a little more, pdflush on 2.6.6-rc3 with
this patch still seems to use more resources than it did on 2.6.5. With
heavy NFS traffic, it still uses about 2-3% CPU on 2.6.6-rc3, but on 2.6.5
it averages about 0.1%. Maybe it just wasn't being used to its full
potential in 2.6.5?

Thanks
- Brent

>
> diff -puN fs/fs-writeback.c~writeback-livelock-fix-2 fs/fs-writeback.c
> --- 25/fs/fs-writeback.c~writeback-livelock-fix-2 2004-04-28 21:19:32.779061976 -0700
> +++ 25-akpm/fs/fs-writeback.c 2004-04-28 21:20:11.080239312 -0700
> @@ -176,11 +176,12 @@ __sync_single_inode(struct inode *inode,
>  			if (wbc->for_kupdate) {
>  				/*
>  				 * For the kupdate function we leave the inode
> -				 * where it is on sb_dirty so it will get more
> +				 * at the head of sb_dirty so it will get more
>  				 * writeout as soon as the queue becomes
>  				 * uncongested.
>  				 */
>  				inode->i_state |= I_DIRTY_PAGES;
> +				list_move_tail(&inode->i_list, &sb->s_dirty);
>  			} else {
>  				/*
>  				 * Otherwise fully redirty the inode so that
>
> _
>
> > pdflush is behaving so far, and I'll say you've figured it out for now,
> > with the final verdict in about 8 hours.
> >
> > Does this mean that, if there were too many dirty pages and not enough
> > time to write them all back, the dirty page list just stopped being
> > traversed, stuck on a single page?
>
> No.. There's all sorts of livelock avoidance code in there and I keep on
> forgetting that sometimes writepage won't write the dang page at all -
> instead it just redirties the page (and hence the inode).
>
> Now, that redirtying of the inode _should_ have moved the inode off the
> s_io list and onto the s_dirty list. But for some reason it looks like it
> didn't, so we get stuck in a loop. I need to think about it a bit more.