Re: [PATCH] Use x86 SSE instructions for clear_page, copy_page

From: Arjan van de Ven
Date: Tue Aug 17 2004 - 02:29:23 EST


On Tue, 2004-08-17 at 08:13, Jens Maurer wrote:
> The attached patch (against kernel 2.6.8.1) enables using SSE
> instructions for copy_page and clear_page.
>
> A user-space test on my Pentium III 850 MHz shows a 3x speedup for
> clear_page (compared to the default "rep stosl"), and a 50% speedup
> for copy_page (compared to the default "rep movsl"). For a Pentium-4,
> the speedup is about 50% in both the clear_page and copy_page cases.


We used to have code like this in 2.4, but it got removed: the
non-temporal store code is faster in a microbenchmark, but it has the
fundamental problem that it evicts the data from the CPU cache; the
actual USE of the data is thus a LOT more expensive, and the result is
that overall system performance goes down ;(
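
To make the trade-off concrete, here is a minimal user-space sketch of the
two ways to zero a page. It is not the 2.4 kernel code and not the code from
the patch; the names (DEMO_PAGE_SIZE, clear_page_cached, clear_page_nt,
sum_page) are made up for illustration, and it assumes an x86 compiler with
SSE2 (it falls back to memset elsewhere). The non-temporal variant writes
around the cache, so whoever reads the page next has to fetch it from memory.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* _mm_stream_si128, _mm_setzero_si128 */
#endif

#define DEMO_PAGE_SIZE 4096   /* hypothetical name; 4 KiB page assumed */

/* Ordinary stores: the zeroed cache lines stay in the CPU cache. */
static void clear_page_cached(void *page)
{
    memset(page, 0, DEMO_PAGE_SIZE);
}

/* Non-temporal stores: the data goes to memory without being kept in
 * the cache. The page must be 16-byte aligned. */
static void clear_page_nt(void *page)
{
    assert(((uintptr_t)page & 15) == 0);
#if defined(__SSE2__)
    __m128i zero = _mm_setzero_si128();
    __m128i *p = (__m128i *)page;

    for (size_t i = 0; i < DEMO_PAGE_SIZE / sizeof(__m128i); i++)
        _mm_stream_si128(p + i, zero);
    _mm_sfence();   /* order the streaming stores before later accesses */
#else
    memset(page, 0, DEMO_PAGE_SIZE);   /* fallback on non-SSE2 targets */
#endif
}

/* Stand-in for the real consumer of the page: touches every byte. */
static unsigned long sum_page(const void *page)
{
    const unsigned char *b = page;
    unsigned long sum = 0;

    for (size_t i = 0; i < DEMO_PAGE_SIZE; i++)
        sum += b[i];
    return sum;
}
```

A benchmark that times only the clear_page_*() call measures the store side
alone. To see the effect described above, time the clear together with a
following sum_page() (or whatever the real user of the page does) on the
same page.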
