Re: [PATCH v3 1/2] writeback: add dirty_background_centisecs per bdi variable

From: Namjae Jeon
Date: Tue Sep 25 2012 - 02:50:20 EST


2012/9/25, Jan Kara <jack@xxxxxxx>:
> On Thu 20-09-12 16:44:22, Wu Fengguang wrote:
>> On Sun, Sep 16, 2012 at 08:25:42AM -0400, Namjae Jeon wrote:
>> > From: Namjae Jeon <namjae.jeon@xxxxxxxxxxx>
>> >
>> > This patch is based on suggestion by Wu Fengguang:
>> > https://lkml.org/lkml/2011/8/19/19
>> >
>> > The kernel has a mechanism to do writeback as per dirty_ratio and
>> > dirty_background_ratio. It also maintains a per-task dirty rate limit to
>> > keep dirty pages balanced at any given instant by doing bdi bandwidth
>> > estimation.
>> >
>> > The kernel also has max_ratio/min_ratio tunables to specify a percentage
>> > of the write cache to control per-bdi dirty limits and task throttling.
>> >
>> > However, there might be a use case where the user wants a per-bdi
>> > writeback tuning parameter to flush dirty data once the per-bdi dirty
>> > data reaches a threshold, especially on an NFS server.
>> >
>> > dirty_background_centisecs provides an interface where the user can tune
>> > the background writeback start threshold using
>> > /sys/block/sda/bdi/dirty_background_centisecs
>> >
>> > dirty_background_centisecs is used along with the average bdi write
>> > bandwidth estimate to start background writeback.
> The functionality you describe, i.e. start flushing a bdi when there's a
> reasonable amount of dirty data on it, looks sensible and useful. However
> I'm not so sure whether the interface you propose is the right one.
> Traditionally, we allow the user to set the amount of dirty data (either in
> bytes or percentage of memory) at which background writeback should start.
> You propose setting the amount of data in centisecs-to-write. Why that
> difference? Also this interface ties our throughput estimation code (which
> is an implementation detail of current dirty throttling) with the userspace
> API. So we'd have to maintain the estimation code forever, possibly also
> face problems when we change the estimation code (and thus estimates in
> some cases) and users will complain that the values they set originally no
> longer work as they used to.
>
> Also, as with each knob, there's the problem of how to properly set its
> value. Most admins won't know about the knob and so won't touch it. Others
> might know about the knob but will have a hard time figuring out what value
> they should set. So if there's a new knob, it should have a sensible initial
> value. And since this feature looks like a useful one, it shouldn't be
> zero.
>
> So my personal preference would be to have bdi->dirty_background_ratio and
> bdi->dirty_background_bytes and start background writeback whenever
> either the global background limit or the per-bdi background limit is
> exceeded. I think this interface will do the job as well and it's easier to
> maintain in the future.
Hi Jan,
Thanks for the review and your opinion.

Hi Wu,
How about adding per-bdi bdi->dirty_background_ratio and
bdi->dirty_background_bytes interfaces, as suggested by Jan?
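
For comparison, here is a rough sketch in plain user-space C (not kernel
code) of how the two interfaces might derive a per-bdi background threshold.
The struct, the helper names, the byte units and the "0 means disabled"
convention are all assumptions made for illustration; only the knob names
come from this thread. The centisecs formula (bandwidth * centisecs / 100)
is one reading of the patch description, and letting bytes take precedence
over ratio is borrowed from how the global vm.dirty_background_* pair
behaves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-bdi settings; not the layout of any real kernel struct. */
struct bdi_bg_sketch {
	uint64_t avg_write_bw;               /* estimated write bandwidth, bytes/sec */
	uint64_t dirty_background_centisecs; /* knob from the v3 patch, 0 = disabled */
	uint64_t dirty_background_bytes;     /* knob suggested by Jan, 0 = disabled */
	unsigned int dirty_background_ratio; /* knob suggested by Jan, percent */
};

/* v3 idea: the threshold is the data the bdi can write in the given time. */
static uint64_t bg_thresh_from_centisecs(const struct bdi_bg_sketch *b)
{
	return b->avg_write_bw * b->dirty_background_centisecs / 100;
}

/* Jan's idea: a fixed byte count, or else a percentage of dirtyable memory. */
static uint64_t bg_thresh_from_bytes_or_ratio(const struct bdi_bg_sketch *b,
					      uint64_t dirtyable_bytes)
{
	assert(b->dirty_background_ratio <= 100);
	if (b->dirty_background_bytes)
		return b->dirty_background_bytes;
	return dirtyable_bytes * b->dirty_background_ratio / 100;
}

/* Start background writeback if either the global or the per-bdi limit is exceeded. */
static bool should_start_background(uint64_t global_dirty, uint64_t global_thresh,
				    uint64_t bdi_dirty, uint64_t bdi_thresh)
{
	if (global_dirty > global_thresh)
		return true;
	return bdi_thresh != 0 && bdi_dirty > bdi_thresh;
}
```

With the first helper the resulting threshold moves whenever the bandwidth
estimate moves, which is the coupling Jan pointed out. With the second it
depends only on values the admin set and on the amount of dirtyable memory.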

Thanks.
>
> Honza