Re: [RFC][PATCH -mm 0/5] cgroup: block device i/o controller (v9)

From: Takuya Yoshikawa
Date: Wed Sep 17 2008 - 05:02:20 EST


Hi,

Andrea Righi wrote:

TODO:

* Try to push down the throttling and implement it directly in the I/O
schedulers, using bio-cgroup (http://people.valinux.co.jp/~ryov/bio-cgroup/)
to keep track of the right cgroup context. This approach could lead to more
memory consumption and increase the number of dirty pages (hard/slow to
reclaim) in the system, since the dirty-page ratio in memory is not
limited. This could even lead to OOM conditions, but these problems
can be resolved directly in the memory cgroup subsystem

* Handle I/O generated by kswapd: at the moment there's no control on the I/O
generated by kswapd; try to use the page_cgroup functionality of the memory
cgroup controller to track this kind of I/O and charge the right cgroup when
pages are swapped in/out

Could you explain which cgroup we should charge when a swap-in or swap-out occurs?
Is there any difference between the following cases?

The target page is
1. used as page cache and not mapped into any address space
2. used as page cache and mapped into some address space
3. not used as page cache and mapped into some address space

I do not think it is fair to charge the process for this kind of I/O. Am I wrong?


* Improve fair throttling: distribute the time to sleep among all the tasks of
a cgroup that exceeded the I/O limits, depending on the amount of I/O activity
generated in the past by each task (see task_io_accounting)
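
As a rough illustration of the proportional split described in this item, here
is a minimal userspace sketch. The struct and function names are hypothetical
and are not taken from the patch set; the only idea borrowed from the text is
that each task sleeps in proportion to the I/O it generated in the past.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-task record; not a kernel structure. */
struct task_io_sample {
	uint64_t bytes_done;	/* past I/O attributed to this task */
	uint64_t sleep_ns;	/* output: this task's share of the sleep time */
};

/*
 * Split total_sleep_ns among n tasks in proportion to bytes_done.
 * If no task has done any I/O yet, fall back to an even split.
 * Assumes total_sleep_ns * bytes_done fits in 64 bits.
 */
static void distribute_sleep(struct task_io_sample *t, size_t n,
			     uint64_t total_sleep_ns)
{
	uint64_t total_bytes = 0;
	size_t i;

	assert(t != NULL && n > 0);

	for (i = 0; i < n; i++)
		total_bytes += t[i].bytes_done;

	for (i = 0; i < n; i++) {
		if (total_bytes == 0)
			t[i].sleep_ns = total_sleep_ns / n;
		else
			t[i].sleep_ns = total_sleep_ns * t[i].bytes_done /
					total_bytes;
	}
}
```

With two tasks that did 300 and 100 bytes of I/O and 1000 ns to distribute,
the first task would sleep 750 ns and the second 250 ns.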

* Try to reduce the cost of calling cgroup_io_throttle() on every submit_bio();
this is not too expensive, but the call of task_subsys_state() surely has
a cost. A possible solution could be to temporarily account I/O in the
current task_struct and call cgroup_io_throttle() only on each X MB of I/O,
or on every Y I/O requests. Ideally both X and Y could be
tuned at runtime by a userspace tool

* Think of an alternative design for general purpose usage; special purpose usage
right now is restricted to improving I/O performance predictability and
evaluating more precise response timings for applications doing I/O. To a large
degree the block I/O bandwidth controller should implement a more complex
logic to better evaluate the real cost of I/O operations, depending also on the
particular block device profile (e.g. USB stick, optical drive, hard disk).
This would also make it possible to account I/O cost appropriately for seeky
workloads with respect to large streaming workloads. Instead of looking at the
request stream and trying to predict how expensive the I/O will be, a
totally different approach could be to collect request timings (start time /
elapsed time) and, based on the collected information, try to estimate the I/O
cost and usage

-Andrea


Thanks,
Takuya Yoshikawa