Re: [PATCH 8/8] vm: Add a tuning knob for vm.max_writeback_pages

From: Peter Zijlstra
Date: Wed Sep 02 2009 - 03:33:47 EST


On Tue, 2009-09-01 at 16:27 -0400, Theodore Tso wrote:
> On Tue, Sep 01, 2009 at 02:44:55PM -0400, Christoph Hellwig wrote:
> > On Tue, Sep 01, 2009 at 08:38:55PM +0200, Peter Zijlstra wrote:
> > > Do we really need a tunable for this?
> >
> > It will make increasing it in the field a lot easier. And having dealt
> > with really large systems, I fear that there are I/O topologies out
> > there for which every "reasonable" value is too low.
> >
> > > I guess we need a limit to avoid it writing out everything, but can't we
> > > have something automagic?
> >
> > Some automatic adjustment would be nice. But finding the right auto
> > tuning will be an interesting exercise.
>
> The fact that the limit is on a per-inode basis is part of the problem.

I would think that it would be a BDI based property, since it basically
depends on the speed of the backing dev you're writing to.

> Right now, we are only writing out X pages per inode, so depending on
> whether we have one really gargantuan inode that needs writeout, or ten
> big inodes which are dirty, or a million small inodes, the fact that we
> are imposing a limit based on the number of pages in a single inode
> that we will write out seems like the wrong design choice.

Agreed, number of chunks, where a chunk is some optimum write size for
the device in question, and number of seeks, seem more suitable
criteria.

Basically limiting the time spent on writeout and not much else.
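
A rough sketch of that idea, in Python for brevity rather than kernel C.
Everything here is hypothetical: writeback_budget_pages() and its
parameters (bandwidth_bytes_per_sec, chunk_bytes, time_limit_sec) are
illustrative names, not existing kernel interfaces, and the 4 KiB page
size is an assumption. It derives a per-device page budget from a time
limit and an estimated device bandwidth, rounded down to whole chunks.
Seeks are not modelled at all.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def writeback_budget_pages(bandwidth_bytes_per_sec, chunk_bytes, time_limit_sec):
    """Return how many pages to write in one pass for a given device.

    The budget is the amount of data the device is estimated to absorb
    within time_limit_sec, rounded down to a whole number of chunks, but
    never less than one chunk so that writeback always makes progress.
    """
    if bandwidth_bytes_per_sec <= 0 or chunk_bytes <= 0 or time_limit_sec <= 0:
        raise ValueError("all parameters must be positive")
    budget_bytes = int(bandwidth_bytes_per_sec * time_limit_sec)
    chunks = max(1, budget_bytes // chunk_bytes)
    return chunks * chunk_bytes // PAGE_SIZE
```

With made-up numbers, a device estimated at 400 MiB/s with a 4 MiB chunk
and a 0.5 second limit would get a budget of 51200 pages, while one
estimated at 4 MiB/s with a 128 KiB chunk would get 512 pages. The limit
falls out of the device speed instead of being a fixed page count.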

> So perhaps the best argument for not making this be a tunable is that
> in the long run, we will need to put in a better algorithm for
> controlling how much writeback we want to do before we start
> saturating RAID arrays, and in that new algorithm this tunable may no
> longer make sense. Fine; at that point, we can make it go away. For
> now, though, it seems to be the best way to tweak what is going on,
> since I doubt we'll be able to come up with one magic number that will
> satisfy everyone.

Thing is, will this single tunable be sufficient for people who have
both a RAID array and a USB stick on the same machine?
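
To put rough numbers on that concern, here is a small Python sketch. The
helper writeout_seconds() is hypothetical, and the page size, the page
limit, and both bandwidth figures are assumed values picked only for
illustration, not measurements. It estimates how long one pass of a
single global page limit would keep each device busy.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def writeout_seconds(max_writeback_pages, bandwidth_bytes_per_sec):
    """Estimate the time to write max_writeback_pages at a given bandwidth."""
    if bandwidth_bytes_per_sec <= 0:
        raise ValueError("bandwidth must be positive")
    return max_writeback_pages * PAGE_SIZE / bandwidth_bytes_per_sec


MiB = 1024 * 1024
limit_pages = 32768  # example value only (128 MiB with 4 KiB pages)

raid_secs = writeout_seconds(limit_pages, 400 * MiB)  # assumed fast array
usb_secs = writeout_seconds(limit_pages, 4 * MiB)     # assumed slow stick
```

Under these assumptions the same limit is about a third of a second of
work on the array and about half a minute on the stick, so a value that
suits one is off by two orders of magnitude for the other.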