Re: Page aging broken in 2.6

From: Benjamin Herrenschmidt
Date: Sat Dec 27 2003 - 00:04:34 EST



> Returning to the "how to flush the tlb after clearing the young bit", at
> least on the x86 I find it more desirable to flush based on mm (in UP
> that's the most efficient and it provides an accurate behaviour, in SMP
> it may still be too costly but surely a lot less costly than a broadcast
> per pte). In 2.4 with the pagetable scan the flush per mm is
> straightforward and it provides a very high probability of optimizing
> away a huge lot of spurious IPI broadcasts. But even in 2.6 the vm is
> unmapping stuff with some aggressive clustering algorithm so that when
> it starts unmapping stuff it drops quite some stuff and there's still a
> relevant probability that only a few mm have to be flushed, which in SMP
> can decrease a lot the need of IPIs. Not sure how these flush_tlb_mm
> ideas translate for ppc though.

Since we use the hash table as a TLB cache, we need to evict entries from
it wherever you would do a flush_tlb. A flush_tlb_mm (or a range flush) is
fairly expensive: we have to calculate the hash value for each page
and evict them all. Also, the "nice" thing about this hash is that, since
we have the VSIDs (a kind of address space number), we can hold
many processes' translations in there for a long time.
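
To make the cost concrete, here is a toy user-space model in C. It is
only a sketch and not the actual ppc code: the names (hash_insert,
hash_evict_page, hash_flush_range, hash_probes), the table size and the
XOR hash are all made up for illustration, and it keeps one slot per
bucket where the real hash table uses groups of PTEs. It does show the
two points above: a range flush costs one hash computation per page in
the range, whether or not that page is in the table, and entries tagged
with another VSID are left alone.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT   12
#define PAGE_SIZE    (1ull << PAGE_SHIFT)
#define HASH_BUCKETS 1024u   /* toy size, must be a power of two */

/* One cached translation, tagged with its address space (VSID). */
struct hpte {
    int      valid;
    uint32_t vsid;
    uint64_t vpn;    /* virtual page number */
};

static struct hpte hash_table[HASH_BUCKETS];
static unsigned long hash_probes;   /* hash computations done by evictions */

/* Illustrative hash only: XOR of VSID and page number, masked to table size. */
static unsigned hash_index(uint32_t vsid, uint64_t vpn)
{
    return (unsigned)((vsid ^ vpn) & (HASH_BUCKETS - 1));
}

/* Cache a translation; like a cache, it overwrites whatever was in the slot. */
static void hash_insert(uint32_t vsid, uint64_t vaddr)
{
    uint64_t vpn = vaddr >> PAGE_SHIFT;
    struct hpte *e = &hash_table[hash_index(vsid, vpn)];

    e->valid = 1;
    e->vsid = vsid;
    e->vpn = vpn;
}

/* Evict one page: compute its hash, drop the entry if it matches. */
static int hash_evict_page(uint32_t vsid, uint64_t vaddr)
{
    uint64_t vpn = vaddr >> PAGE_SHIFT;
    struct hpte *e = &hash_table[hash_index(vsid, vpn)];

    hash_probes++;
    if (e->valid && e->vsid == vsid && e->vpn == vpn) {
        e->valid = 0;
        return 1;
    }
    return 0;
}

/* Flush [start, end): every page in the range has to be hashed and probed. */
static int hash_flush_range(uint32_t vsid, uint64_t start, uint64_t end)
{
    int evicted = 0;

    assert(start <= end);
    for (uint64_t va = start & ~(PAGE_SIZE - 1); va < end; va += PAGE_SIZE)
        evicted += hash_evict_page(vsid, va);
    return evicted;
}

/* Count the translations still cached, across all address spaces. */
static int hash_count_valid(void)
{
    int n = 0;

    for (unsigned i = 0; i < HASH_BUCKETS; i++)
        n += hash_table[i].valid;
    return n;
}
```

In this model, flushing a 1MB range for one VSID does 256 probes even if
only three of those pages are actually in the table, and the entries
belonging to the other VSIDs stay where they are.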

On the other hand, we don't need IPIs for any kind of flush (the
actual TLB flushes that we perform after evicting the hash entries
do broadcast in HW).

> The dirty and accessed bitflags instead are quite a different matter
> w.r.t. tlb flushing, we can't defer the tlb flush after atomically
> clearing the pte in smp while we clear the dirty bit. The tlb shootdown
> is the clustered version of that. The shootdown runs a broadcast IPI
> no more than once every 508 ptes freed per mm. For the same reason we can
> try to coalesce the tlb flush post-clear-young with an mm flush, we can
> achieve a similar coalescing without the need of an exact tlb
> shootdown like in the pte freeing


