Re: [PATCH 01/16] mm: delayed page activation

From: Nikita Danilov
Date: Wed Dec 07 2005 - 07:43:29 EST


Wu Fengguang writes:
> On Tue, Dec 06, 2005 at 08:55:13PM +0300, Nikita Danilov wrote:

[...]

> >
> > But this change increased lifetimes of _all_ pages, so this is
>
> Yes, it also increased the lifetimes by meaningful values: first re-accessed
> pages are prolonged more lifetime. Immediately removing them from inactive_list
> is basicly doing MRU eviction.

Are you talking about CLOCK-pro here? I don't understand your statement
in the context of current VM: if the "first re-accessed" page was close
to the cold tail of the inactive list, and "second re-accessed" page was
close to the head of the inactive list, then life-time of second one is
increased by larger amount.

>
> > irrelevant. Consequently, it has a chance of increasing scanning
> > activity, because there will be more referenced pages at the cold tail
> > of the inactive list.
>
> Delayed activation increased scanning activity, while immediate activation
> increased the locking activity. Early profiling data on a 2 CPU Xeon box showed
> that the delayed activation acctually cost less time.

That's great, but current mark_page_accessed() has an important
advantage: the work is done by the process that accessed the page in
read/write path, or at page fault. By delegating activation to the VM
scanner, the burden of work is shifted to the innocent thread that
happened to trigger scanning during page allocation.

It's basically always useful to follow the principle that the thread
that used resources pays the overhead. This leads to more balanced
behavior with better worst-case.

Compare this with balance_dirty_pages() that throttles heavy writers by
forcing them to do the write-out. In the same vein, mark_page_accessed()
throttles thread (a bit) by forcing it to book-keep VM lists. Ideally,
threads doing a lot of page allocations, should be forced to do some
scanning, even if there is no memory shortage, etc.

As for locking overhead, mark_page_accessed() can be batched (I even
have a patch to do this, but it doesn't show any improvement).

>
> Thanks,
> Wu

Nikita.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/