On Fri, Jul 24, 2009 at 12:21:50PM +0200, Thomas Hellstrom wrote:
Andi Kleen wrote:
Thomas Hellstrom <thellstrom@xxxxxxxxxx> writes:
The current code uses wbinvd() when the area to flush is > 4MB. Although this
may be faster than using clflush() the effect of wbinvd() on irq latencies
may be catastrophic on systems with large caches. Therefore use clflush()

may be? You seem to miss some hard data here.

Admittedly.

So was it motivated by a real problem?
OK. We should really test this at some point. I currently don't have the hardware to do so. However, the concept of flushing and invalidating the caches completely on systems with many
processors and huge caches when we intend to flush only a small piece of the cache also sounds like overkill.
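
For readers following along, here is a minimal sketch of the size-based choice being debated. It is plain C for illustration, not the actual kernel code. The 4MB threshold comes from the patch description quoted above; the enum, the function names and the cache-line size passed in are assumptions made up for this example. It only counts the lines a clflush() loop would touch and never executes clflush or wbinvd.

```c
#include <assert.h>
#include <stddef.h>

/* Threshold taken from the "> 4MB" figure quoted in the thread. */
#define WBINVD_THRESHOLD_BYTES ((size_t)4 * 1024 * 1024)

/* Hypothetical names, for illustration only. */
enum flush_method { FLUSH_BY_CLFLUSH, FLUSH_BY_WBINVD };

/*
 * Number of cache lines overlapped by [addr, addr + len), i.e. how many
 * clflush instructions a line-by-line flush of that range would issue.
 * line_bytes is an assumed cache-line size supplied by the caller.
 */
static size_t clflush_line_count(size_t addr, size_t len, size_t line_bytes)
{
    assert(line_bytes > 0);
    if (len == 0)
        return 0;

    size_t first = addr / line_bytes;
    size_t last = (addr + len - 1) / line_bytes;
    return last - first + 1;
}

/* Size-based choice as described: whole-cache flush only above the threshold. */
static enum flush_method choose_flush_method(size_t len)
{
    return len > WBINVD_THRESHOLD_BYTES ? FLUSH_BY_WBINVD : FLUSH_BY_CLFLUSH;
}
```

Assuming 64-byte lines, a 4KB range costs 64 clflush operations while a range right at the 4MB boundary costs 65536. That per-line cost is what switching to wbinvd() was meant to avoid; whether it is worth the irq latency is the open question in this thread.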
The other CPUs will not block (just flush their caches in the background or
in parallel), so the latency shouldn't scale with the number of sockets.
Also, the number of cores shouldn't impact it because these tend
to have shared cache hierarchies.
That's just a theory, but not necessarily a worse one than yours :-)
Furthermore, since the wbinvd() has been introduced as an optimization of the general clflush() case, did somebody ever check the effects on systems with many processors and huge caches?
Typically systems with large caches flush faster too.
-Andi