Re: Page aging broken in 2.6

From: Benjamin Herrenschmidt
Date: Fri Dec 26 2003 - 19:48:38 EST


On Sat, 2003-12-27 at 11:35, Linus Torvalds wrote:
> On Sat, 27 Dec 2003, Benjamin Herrenschmidt wrote:
> >
> > Or do what I propose here, that is have ptep_test_and_clear_* be
> > responsible for the flush on archs where it is necessary, but then
> > it would be nice to have more than the ptep as an argument...
>
> The dirty handling already does the TLB flush (in that case it's a
> correctness issue, not a hint). So it's only ptep_test_and_clear_young()
> that matters.

Yes, but it would be possible to optimize it in some way on our
beloved hash tables ;) (by marking the entry read-only in the
hash instead of evicting it). Maybe not worth the pain though...

> I don't know whether that ever ends up being performance-critical, and I
> don't see what else could be passed into it. We literally don't _have_
> anything else than the pte.

Ok, figured that out.

> But the ppc architecture could easily decide to walk the hash tables and
> invalidate in ptep_test_and_clear_young(). And if it ends up being a
> performance issue, it _appears_ that all users of "page_referenced()"
> (which is the only thing that does this) are actually using the return
> value as just a boolean. And it's entirely possible that we should break
> out of "page_referenced()" on the _first_ hit of "yes, this has been
> referenced".

Except that we may expect all "referencing" PTEs to have the accessed
bit cleared, no? Otherwise, if a page has lots of users, we'll keep
getting positive results long after the page was actually referenced... I
don't know if this would be a real problem though.
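
To make the trade-off concrete, here is a rough userspace sketch, not
kernel code. Everything named toy_* and TOY_PTE_YOUNG is made up for
illustration; a plain array stands in for the walk over a page's
mappings, and no TLB or hash flush is modelled at all.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical accessed-bit mask; not a real architecture value. */
#define TOY_PTE_YOUNG 0x1u

typedef unsigned int toy_pte_t;

/* Toy stand-in for ptep_test_and_clear_young(): report the old
 * accessed bit and clear it. */
static bool toy_test_and_clear_young(toy_pte_t *ptep)
{
    bool young = (*ptep & TOY_PTE_YOUNG) != 0;

    *ptep &= ~TOY_PTE_YOUNG;
    return young;
}

/* Visit every mapping and clear every accessed bit, returning a count. */
static int toy_referenced_all(toy_pte_t *ptes, size_t n)
{
    int referenced = 0;

    for (size_t i = 0; i < n; i++)
        if (toy_test_and_clear_young(&ptes[i]))
            referenced++;
    return referenced;
}

/* Stop at the first hit; later mappings keep their accessed bits. */
static bool toy_referenced_first(toy_pte_t *ptes, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (toy_test_and_clear_young(&ptes[i]))
            return true;
    return false;
}
```

With three mappings that all have the bit set, toy_referenced_first()
says "referenced" on three successive scans, while toy_referenced_all()
says so only on the first. That is the effect I'm worried about above.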

> That would make it much less CPU-intensive to make
> "ptep_test_and_clear_young()" slightly heavier to execute. It would also
> cause "page_referenced()" to not clear _all_ mapped reference bits at the
> same time - which might unfairly cause multi-used pages to stay in memory.
> On the other hand, that might be the _right_ behaviour.
>
> Rik? Andrea?
>
> Worth testing, perhaps.

Ok, right now, Anton is testing a patch from paulus where we do our
own flush batching and do the flush inside ptep_test_and_clear_*. That
will at least fix the problem for us for now.
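
For the general idea only, here is a toy userspace sketch of batching
the flush inside the test-and-clear helper. This is not the paulus
patch; the batch structure, the batch size of 4 and all batch_* names
are assumptions made up for illustration, and the actual hash
invalidation is reduced to bumping counters.

```c
#include <stdbool.h>
#include <stddef.h>

#define BATCH_PTE_YOUNG 0x1u   /* hypothetical accessed-bit mask */
#define BATCH_MAX       4      /* arbitrary batch size for the sketch */

typedef unsigned int batch_pte_t;

struct batch_flush {
    batch_pte_t *pending[BATCH_MAX]; /* entries waiting to be flushed */
    size_t count;                    /* number of pending entries */
    size_t flushes;                  /* how many batch flushes ran */
    size_t entries_flushed;          /* total entries flushed so far */
};

/* Flush everything queued so far. A real implementation would
 * invalidate the matching hash table entries here. */
static void batch_flush_pending(struct batch_flush *b)
{
    if (b->count == 0)
        return;
    b->entries_flushed += b->count;
    b->flushes++;
    b->count = 0;
}

/* Test and clear the accessed bit; queue a flush only if the bit was
 * set, and flush the whole batch once it fills up. */
static bool batch_test_and_clear_young(struct batch_flush *b,
                                       batch_pte_t *ptep)
{
    if (!(*ptep & BATCH_PTE_YOUNG))
        return false;

    *ptep &= ~BATCH_PTE_YOUNG;
    b->pending[b->count++] = ptep;
    if (b->count == BATCH_MAX)
        batch_flush_pending(b);
    return true;
}
```

The caller still has to call batch_flush_pending() once at the end of
the scan to drain whatever is left in a partially filled batch.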

Ben.

