Re: [patch 0/7] cpuset writeback throttling

From: Andrew Morton
Date: Wed Nov 05 2008 - 13:42:39 EST


On Wed, 5 Nov 2008 07:52:44 -0600 (CST)
Christoph Lameter <cl@xxxxxxxxxxxxxxxxxxxx> wrote:

> On Tue, 4 Nov 2008, Andrew Morton wrote:
>
> >> That is one aspect. When performing writeback then we need to figure out
> >> which inodes have dirty pages in the memcg and we need to start writeout
> >> on those inodes and not on others that have their dirty pages elsewhere.
> >> There are two components of this that are in this patch and that would
> >> also have to be implemented for a memcg.
> >
> > Doable. lru->page->mapping->host is a good start.
>
> The block layer has a list of inodes that are dirty. From that we need to
> select ones that will improve the situation from the cpuset/memcg. How
> does the LRU come into this?

In the simplest case, dirty-memory throttling can just walk the LRU
writing back pages in the same way that kswapd does.

There would probably be performance benefits in doing
address_space-ordered writeback, so the dirty-memory throttling could
pick a dirty page off the LRU, go find its inode and then feed that
into __sync_single_inode().
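
A minimal userspace sketch of that second idea, for illustration only.
The structs are toy stand-ins rather than the kernel's definitions, the
LRU is modelled as a plain singly linked list, and
toy_sync_single_inode() and toy_throttle_writeback() are hypothetical
names; the first is only a placeholder for what __sync_single_inode()
would do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel objects; NOT the real definitions. */
struct inode { int ino; };
struct address_space { struct inode *host; };
struct page {
	struct address_space *mapping;	/* NULL for anonymous pages */
	bool dirty;
	struct page *lru_next;		/* simplified LRU linkage (assumption) */
};

/*
 * Hypothetical placeholder for __sync_single_inode(): "writes back"
 * every dirty page belonging to @inode, i.e. address_space-ordered
 * writeback in this model. Returns the number of pages cleaned.
 */
static int toy_sync_single_inode(struct inode *inode, struct page *lru_head)
{
	int written = 0;

	for (struct page *p = lru_head; p; p = p->lru_next) {
		if (p->dirty && p->mapping && p->mapping->host == inode) {
			p->dirty = false;
			written++;
		}
	}
	return written;
}

/*
 * Walk the LRU; for each dirty file page, follow page->mapping->host
 * to its inode and write that whole inode back. Stops once at least
 * @nr_to_write pages are clean, so it may overshoot the target.
 */
static int toy_throttle_writeback(struct page *lru_head, int nr_to_write)
{
	int written = 0;

	assert(nr_to_write >= 0);
	for (struct page *p = lru_head; p && written < nr_to_write;
	     p = p->lru_next) {
		if (!p->dirty || !p->mapping)
			continue;
		written += toy_sync_single_inode(p->mapping->host, lru_head);
	}
	return written;
}
```

In this model the inode that gets written is always one that owns a
dirty page on the LRU being walked, which is the association between
the memory container and the inode that the LRU provides.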

> >> This patch would solve the problem if the calculation of the dirty pages
> >> would consider the active memcg and be able to determine the amount of
> >> dirty pages (through some sort of additional memcg counters). That is just
> >> the first part though. The second part of finding the inodes that have
> >> dirty pages for writeback would require an association between memcgs and
> >> inodes.
> >
> > We presently have that via the LRU. It has holes, but so does this per-cpuset
> > scheme.
>
> How do I get to the LRU from the dirtied list of inodes?

Don't need it.

It'll be approximate and has obvious scenarios of great inaccuracy
but it'll suffice for the workloads which this patchset addresses.



It sounds like any memcg-based approach just won't be suitable for the
people who are hitting this problem.

But _are_ people hitting this problem? I haven't seen any real-looking
reports in ages. Is there some workaround? If so, what is it? How
serious is this problem now?
