Re: [PATCH 3/4] powerpc32: memset(0): use cacheable_memzero

From: christophe leroy
Date: Thu May 14 2015 - 04:50:31 EST




On 14/05/2015 02:55, Scott Wood wrote:
On Tue, 2015-05-12 at 15:32 +0200, Christophe Leroy wrote:
cacheable_memzero uses dcbz instruction and is more efficient than
memset(0) when the destination is in RAM

This patch renames memset as generic_memset, and defines memset
as a prolog to cacheable_memzero. This prolog checks if the byte
to set is 0 and if the buffer is in RAM. If not, it falls back to
generic_memset()

Signed-off-by: Christophe Leroy <christophe.leroy@xxxxxx>
---
arch/powerpc/lib/copy_32.S | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/lib/copy_32.S b/arch/powerpc/lib/copy_32.S
index cbca76c..d8a9a86 100644
--- a/arch/powerpc/lib/copy_32.S
+++ b/arch/powerpc/lib/copy_32.S
@@ -12,6 +12,7 @@
#include <asm/cache.h>
#include <asm/errno.h>
#include <asm/ppc_asm.h>
+#include <asm/page.h>
#define COPY_16_BYTES \
lwz r7,4(r4); \
@@ -74,6 +75,18 @@ CACHELINE_MASK = (L1_CACHE_BYTES-1)
* to set them to zero. This requires that the destination
* area is cacheable. -- paulus
*/
+_GLOBAL(memset)
+	cmplwi	r4,0
+	bne-	generic_memset
+	cmplwi	r5,L1_CACHE_BYTES
+	blt-	generic_memset
+	lis	r8,max_pfn@ha
+	lwz	r8,max_pfn@l(r8)
+	tophys	(r9,r3)
+	srwi	r9,r9,PAGE_SHIFT
+	cmplw	r9,r8
+	bge-	generic_memset
+	mr	r4,r5

max_pfn includes highmem, and tophys only works on normal kernel
addresses.

Is there any other simple way to determine whether an address is in RAM or not?

I did that because of the function below, from mm/mem.c:

int page_is_ram(unsigned long pfn)
{
#ifndef CONFIG_PPC64	/* XXX for now */
	return pfn < max_pfn;
#else
	unsigned long paddr = (pfn << PAGE_SHIFT);
	struct memblock_region *reg;

	for_each_memblock(memory, reg)
		if (paddr >= reg->base && paddr < (reg->base + reg->size))
			return 1;
	return 0;
#endif
}
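
To make the concern about max_pfn and tophys concrete, here is a rough userspace model of the three tests the proposed prolog performs. It is only a sketch: the PAGE_OFFSET, PAGE_SHIFT and L1_CACHE_BYTES values are assumptions chosen for illustration, and model_tophys() and takes_dcbz_path() are hypothetical names, not kernel functions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative values only; the real ones depend on the kernel config. */
#define PAGE_SHIFT      12
#define PAGE_OFFSET     0xC0000000u  /* assumed start of the linear mapping */
#define L1_CACHE_BYTES  32u

/* Models tophys(): a plain subtraction, only meaningful for addresses
 * that really are inside the kernel linear mapping. */
static uint32_t model_tophys(uint32_t vaddr)
{
    return vaddr - PAGE_OFFSET;
}

/* Mirrors the three tests in the proposed memset prolog.
 * Returns true when the prolog would branch on to cacheable_memzero. */
static bool takes_dcbz_path(uint32_t vaddr, int c, size_t n, uint32_t max_pfn)
{
    if (c != 0)
        return false;                /* bne- generic_memset */
    if (n < L1_CACHE_BYTES)
        return false;                /* blt- generic_memset */
    /* bge- generic_memset when the computed pfn is not below max_pfn */
    return (model_tophys(vaddr) >> PAGE_SHIFT) < max_pfn;
}
```

With these assumed values and max_pfn = 0x40000 (1 GiB of RAM including highmem), an address such as 0xF0000000, which lies outside the linear mapping, yields a computed pfn of 0x30000 and passes the test, even though tophys is not valid for it. With max_pfn = 0x20000 the same address is rejected, so the result of the check depends on how much highmem is present.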




If we were to point memset_io, memcpy_toio, etc. at noncacheable
versions, are there any other callers left that can reasonably point at
uncacheable memory?

Do you mean we could just assume that memcpy() and memset() are only ever called with a destination in RAM, and thus avoid the check?
copy_tofrom_user() already makes this assumption (although a user app could possibly provide a buffer located in an ALSA-mapped I/O area).

Christophe

