Re: [RFC][PATCH -mm 0/5] cgroup: block device i/o controller (v9)

From: Andrea Righi
Date: Wed Sep 17 2008 - 04:48:19 EST


Hirokazu Takahashi wrote:
> Hi,
>
>> TODO:
>>
>> * Try to push down the throttling and implement it directly in the I/O
>> schedulers, using bio-cgroup (http://people.valinux.co.jp/~ryov/bio-cgroup/)
>> to keep track of the right cgroup context. This approach could lead to more
>> memory consumption and increases the number of dirty pages (hard/slow to
>> reclaim pages) in the system, since dirty-page ratio in memory is not
>> limited. This could even lead to potential OOM conditions, but these problems
>> can be resolved directly in the memory cgroup subsystem
>>
>> * Handle I/O generated by kswapd: at the moment there's no control on the I/O
>> generated by kswapd; try to use the page_cgroup functionality of the memory
>> cgroup controller to track this kind of I/O and charge the right cgroup when
>> pages are swapped in/out
>
> FYI, this can also be done with bio-cgroup, which determines the owner
> cgroup of a given anonymous page.
>
> Thanks,
> Hirokazu Takahashi

That would be great! FYI here is how I would like to proceed:

- today I'll post a new version of my cgroup-io-throttle patch, rebased
to 2.6.27-rc5-mm1 (it's well tested and seems to be stable enough).
To keep things light and simple I've implemented custom
get_cgroup_from_page() / put_cgroup_from_page() in the memory
controller to retrieve the owner of a page, holding a reference to the
corresponding memcg, during async writes in submit_bio(). This is
probably not the best way to proceed, and a more generic framework like
bio-cgroup sounds better, but it seems to work quite well. The only
problem I've found is that during swap_writepage() the page is not
assigned to any page_cgroup (page_get_page_cgroup() returns NULL), so
I'm not able to charge the cost of this I/O operation to the right
cgroup. Does bio-cgroup address or even resolve this issue?
- begin to implement a new branch of cgroup-io-throttle on top of
bio-cgroup
- also start to implement an additional request queue that first
provides control at the cgroup level, plus a dispatcher that passes
requests on to the elevator (as suggested by Vivek)
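
To make the first item a bit more concrete, here is a rough userspace
sketch of the page-owner lookup, not the code from the patch. The struct
layouts, the plain integer refcount and charge_async_write() are
hypothetical stand-ins for illustration; only the names
get_cgroup_from_page() / put_cgroup_from_page() come from the text above,
and their real signatures may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins; these are NOT the real kernel structures. */
struct memcg {
    const char *name;
    int refcount;            /* assumed stand-in for the memcg reference */
    unsigned long io_bytes;  /* I/O charged to this cgroup so far */
};

struct page_cgroup {
    struct memcg *memcg;     /* owner of the page */
};

struct page {
    struct page_cgroup *pc;  /* NULL when no owner is recorded */
};

/* Look up the owner of a page and pin it; NULL if the owner is unknown. */
static struct memcg *get_cgroup_from_page(struct page *page)
{
    struct page_cgroup *pc = page->pc;

    if (pc == NULL || pc->memcg == NULL)
        return NULL;
    pc->memcg->refcount++;
    return pc->memcg;
}

/* Drop the reference taken by get_cgroup_from_page(). */
static void put_cgroup_from_page(struct memcg *cg)
{
    if (cg == NULL)
        return;
    assert(cg->refcount > 0);
    cg->refcount--;
}

/*
 * Hypothetical helper: charge an async write to the page owner.
 * Returns 1 if the I/O was charged, 0 if no owner could be found
 * (the situation described above for swap_writepage()).
 */
static int charge_async_write(struct page *page, unsigned long bytes)
{
    struct memcg *cg = get_cgroup_from_page(page);

    if (cg == NULL)
        return 0;
    cg->io_bytes += bytes;
    put_cgroup_from_page(cg);
    return 1;
}
```

In this model a page whose pc pointer is NULL simply goes uncharged,
which is the gap I'd like bio-cgroup to close.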

Thanks,
-Andrea