Re: [PATCH 2/2] blk-throttle: Take blkcg->lock while traversing blkcg->policy_list

From: Vivek Goyal
Date: Tue Oct 25 2011 - 15:02:16 EST


On Tue, Oct 25, 2011 at 04:13:11PM +0200, Jens Axboe wrote:
> On 2011-10-21 14:10, Vivek Goyal wrote:
> > On Thu, Oct 20, 2011 at 02:29:58PM -0700, Tejun Heo wrote:
> >> Hello,
> >>
> >> On Thu, Oct 20, 2011 at 05:20:21PM -0400, Vivek Goyal wrote:
> >>> The only problem with this approach is that it will clean up per device
> >>> weight rules also at elevator_exit() time, which is not the same as device
> >>> removal, and one might decide to bring CFQ back on the device and we will
> >>> need the rules again.
> >>
> >> I actually think removing those rules on elevator detach would be the
> >> right thing to do. We don't try to keep cfq setting across elevator
> >> switch. When we're switching from cfq, we're detaching iocg policy
> >> too. The settings going away is perfectly fine. I actually think
> >> it's a pretty bad idea to implement ad-hoc setting persistence in
> >> kernel. Just making sure that userland is notified is a far better
> >> approach. Userland has all the facilities to deal with this type of
> >> situation.
> >>
> >> When switching from cfq to deadline, we lose the whole proportional io
> >> control. It's way more confusing to have lingering settings which
> >> don't do anything.
> >
> > I am not so sure about this. Suppose tomorrow another IO scheduler starts
> > taking into account the cgroup global weight or cgroup per device weight
> > to do some kind of IO prioritization, then removing the rules upon
> > changing the IO scheduler will not make sense.
> >
> > IOW, rules are per cgroup per device and not per cgroup per IO scheduler
> > and more than one IO scheduler should be able to share the rules.
>
> FWIW, I agree with Tejun here. A switch operation is a reset, start from
> scratch. We don't preserve other per IO-scheduler settings on a switch,
> preserving _some_ settings is just confusing.

Ok. But this is more of a per queue setting (per cgroup, per device) and
not a per IO scheduler one. That currently only CFQ makes use of it is a
separate matter.

If we start looking at them just as CFQ specific weights, then it is a
different story. My thought process about these files was per cgroup per
device weights.
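
To make the two views concrete, here is a rough userspace sketch, not
kernel code. All names in it (weight_rule, rule_table, rule_set, rule_get,
elevator_exit_reset, elevator_exit_keep) are made up for illustration and
do not correspond to actual blk-cgroup or CFQ functions. The rule is keyed
only by (cgroup, device); the two exit helpers model "a switch is a reset"
versus "rules belong to the cgroup/device pair and survive the switch".

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RULES 16

/* Hypothetical rule: keyed by (cgroup, device), no IO scheduler involved. */
struct weight_rule {
	int cgroup_id;
	unsigned int dev;	/* opaque device number */
	unsigned int weight;	/* 0 is reserved to mean "no rule" */
};

struct rule_table {
	struct weight_rule rules[MAX_RULES];
	size_t nr;
};

/* Add or update the rule for (cgroup_id, dev). Returns 0, or -1 if full. */
static int rule_set(struct rule_table *t, int cgroup_id,
		    unsigned int dev, unsigned int weight)
{
	size_t i;

	assert(t != NULL);
	for (i = 0; i < t->nr; i++) {
		if (t->rules[i].cgroup_id == cgroup_id &&
		    t->rules[i].dev == dev) {
			t->rules[i].weight = weight;
			return 0;
		}
	}
	if (t->nr == MAX_RULES)
		return -1;
	t->rules[t->nr].cgroup_id = cgroup_id;
	t->rules[t->nr].dev = dev;
	t->rules[t->nr].weight = weight;
	t->nr++;
	return 0;
}

/* Return the weight for (cgroup_id, dev), or 0 if no rule exists. */
static unsigned int rule_get(const struct rule_table *t, int cgroup_id,
			     unsigned int dev)
{
	size_t i;

	for (i = 0; i < t->nr; i++) {
		if (t->rules[i].cgroup_id == cgroup_id &&
		    t->rules[i].dev == dev)
			return t->rules[i].weight;
	}
	return 0;
}

/* View A: elevator switch is a reset, so drop every rule for this device. */
static void elevator_exit_reset(struct rule_table *t, unsigned int dev)
{
	size_t i, out = 0;

	for (i = 0; i < t->nr; i++) {
		if (t->rules[i].dev != dev)
			t->rules[out++] = t->rules[i];
	}
	t->nr = out;
}

/* View B: rules are per cgroup per device, so the switch leaves them alone. */
static void elevator_exit_keep(struct rule_table *t, unsigned int dev)
{
	(void)t;
	(void)dev;
}
```

With view A, a rule set for a device is gone after the elevator exits and
has to be re-applied from userland; with view B, a later IO scheduler on
the same device would find the rule still in place.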

Thanks
Vivek