Re: [RFC v1] add new io-scheduler to use cgroup on high-speed device

From: Vivek Goyal
Date: Wed Jun 05 2013 - 10:46:52 EST

On Tue, Jun 04, 2013 at 08:03:37PM -0700, Tejun Heo wrote:
> (cc'ing Kent. Original posting at
> )
> Hello,
> On Wed, Jun 05, 2013 at 10:09:31AM +0800, Robin Dong wrote:
> > We want to use blkio.cgroup on high-speed devices (like fusionio) for our mysql clusters.
> > After testing different io-schedulers, we found that cfq is too slow and deadline can't run on cgroup.
> > So we developed a new io-scheduler: tpps (Tiny Parallel Proportion Scheduler). It dispatches requests
> > using only their individual weight and the total weight (proportion), so it is simple and efficient.
> >
> > Test case: fusionio card, 4 cgroups, iodepth-512
> So, while I understand the intention behind it, I'm not sure a
> separate io-sched for this is what we want. Kent and Jens have been
> thinking about this lately so they'll probably chime in. From my POV,
> I see a few largish issues.
> * It has to be scalable with relatively large scale SMP / NUMA
> configurations. It better integrate with blk-mq support currently
> being brewed.
> * It definitely has to support hierarchy. Nothing which doesn't
> support full hierarchy can be added to cgroup at this point.
> * We already have separate implementations in blk-throtl and
> cfq-iosched. Maybe it's too late and too different for cfq-iosched
> given that it's primarily targeted at disks, but I wonder whether we
> can make blk-throtl generic and scalable enough to cover all other
> use cases.

A generic implementation at the block layer also has the advantage that we
can use it for block devices that do not use an IO scheduler (dm, md),
and we can enforce the algorithm higher up in the stack.
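
For readers following the thread, the proportional dispatch described in
the quoted tpps proposal (individual weight over total weight) can be
sketched roughly as below. This is not the tpps code and not kernel code;
it is a userspace illustration only. The function name, the group names and
the weights are made up for the example. Only the 4 cgroups and the iodepth
of 512 come from the quoted test case.

```python
def proportional_quota(weights, depth):
    """Split `depth` dispatch slots among groups in proportion to weight.

    `weights` maps a (hypothetical) group name to its weight.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("total weight must be positive")

    # Integer share first; floor division never over-allocates.
    quota = {g: depth * w // total for g, w in weights.items()}

    # Hand out the slots lost to rounding, largest remainder first.
    leftover = depth - sum(quota.values())
    by_remainder = sorted(
        weights, key=lambda g: (depth * weights[g]) % total, reverse=True
    )
    for g in by_remainder[:leftover]:
        quota[g] += 1
    return quota


# Assumed weights for four groups sharing a queue depth of 512.
example = proportional_quota({"a": 100, "b": 200, "c": 300, "d": 400}, 512)
```

Whether the quota is computed per device queue, as here, or higher up in
the stack is exactly the design question raised above; the arithmetic is
the same either way.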
