Re: [PATCH 04/35] writeback: reduce per-bdi dirty threshold rampup time

From: Wu Fengguang
Date: Tue Dec 14 2010 - 10:15:29 EST


On Tue, Dec 14, 2010 at 10:50:55PM +0800, Peter Zijlstra wrote:
> On Tue, 2010-12-14 at 22:39 +0800, Wu Fengguang wrote:
> > On Tue, Dec 14, 2010 at 10:33:25PM +0800, Wu Fengguang wrote:
> > > On Tue, Dec 14, 2010 at 09:59:10PM +0800, Wu Fengguang wrote:
> > > > On Tue, Dec 14, 2010 at 09:37:34PM +0800, Richard Kennedy wrote:
> > >
> > > > > > As to the ramp up time, when writing to 2 disks at the same time I see
> > > > > > the per_bdi_threshold taking up to 20 seconds to converge on a steady
> > > > > > value after one of the writes stops. So I think this could be sped up
> > > > > > even more, at least on my setup.
> > > >
> > > > > I have roughly the same ramp up time on the 1-disk 3GB mem test:
> > > >
> > > > http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/tests/3G/ext4-1dd-1M-8p-2952M-2.6.37-rc5+-2010-12-09-00-37/dirty-pages.png
> > > >
> > >
> > > > Interestingly, the above graph shows that after about 10s of fast ramp
> > > > up, there is another 20s of slow ramp down. It's obviously due to the
> > > > decline of the global limit:
> > >
> > > http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/tests/3G/ext4-1dd-1M-8p-2952M-2.6.37-rc5+-2010-12-09-00-37/vmstat-dirty.png
> > >
> > > > But why is the global limit declining? The following log shows that
> > > > nr_file_pages keeps growing and goes stable after 75 seconds (such a
> > > > long time!). In the same period nr_free_pages goes slowly down to its
> > > > stable value. Given that the global limit is mainly derived from
> > > > nr_free_pages+nr_file_pages (I disabled swap), something must be
> > > > slowly eating memory until 75 seconds. Maybe the tracing ring buffers?
> > >
> > > >          free     file      reclaimable pages
> > > >   50s    369324 + 318760 => 688084
> > > >   60s    235989 + 448096 => 684085
> > > http://www.kernel.org/pub/linux/kernel/people/wfg/writeback/tests/3G/ext4-1dd-1M-8p-2952M-2.6.37-rc5+-2010-12-09-00-37/vmstat
> >
> > > The log shows that ~64MB of reclaimable memory is stolen. But the trace
> > > data only takes 1.8MB. Hmm..
>
> Also, trace buffers are fully pre-allocated.
>
> Inodes perhaps?

Just figured out that it's the buffer heads :)

The other interesting question is why it takes up to 50s to consume
all the nr_free_pages pages. I would imagine the free pages would be
quickly allocated to the page cache..
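
For anyone who wants to redo the arithmetic on the two vmstat samples
quoted above, here is a minimal sketch. It assumes 4KB pages, which the
log itself does not state, and the helper names are made up for
illustration only. It adds nr_free_pages and nr_file_pages per sample
and converts the drop between the 50s and 60s samples to MiB.

```python
# Assumption: 4 KiB pages (not stated in the log; typical for x86).
PAGE_SIZE = 4096

# (nr_free_pages, nr_file_pages) at 50s and 60s, copied from the quoted log.
samples = {
    50: (369324, 318760),
    60: (235989, 448096),
}


def reclaimable(free_pages, file_pages):
    """Reclaimable pages as used in this thread: free + file (swap disabled)."""
    return free_pages + file_pages


def pages_to_mib(pages, page_size=PAGE_SIZE):
    """Convert a page count to MiB under the assumed page size."""
    return pages * page_size / (1024 * 1024)


r50 = reclaimable(*samples[50])
r60 = reclaimable(*samples[60])
lost = r50 - r60                                # pages no longer reclaimable
free_drop = samples[50][0] - samples[60][0]     # free pages consumed in 10s
file_growth = samples[60][1] - samples[50][1]   # page cache growth in 10s

print("reclaimable at 50s:", r50)
print("reclaimable at 60s:", r60)
print("lost in 10s: %d pages (%.2f MiB)" % (lost, pages_to_mib(lost)))
print("free dropped by %d pages, file grew by %d pages" % (free_drop, file_growth))
```

The gap between the free page drop and the page cache growth over those
10 seconds is the memory that went somewhere other than the page cache.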

Attached is the graph for ext2-1dd-1M-8p-2952M-2.6.37-rc5+-2010-12-09-01-36

Thanks,
Fengguang

Attachment: vmstat-reclaimable-500.png
Description: PNG image