RE: [PATCH RFC] [x86] Optimize copy-page by reducing impact from HW prefetch

From: Ma, Ling
Date: Wed Jun 22 2011 - 21:02:09 EST


Yes, I have also tested a 64-bit Atom; it showed an 11.6% improvement.
Because almost all older CPUs use a prefetch-next-line mechanism, the patch should be useful for them.

Thanks
Ling
> -----Original Message-----
> From: Ma, Ling
> Sent: Monday, June 20, 2011 11:43 AM
> To: Ma, Ling; mingo@xxxxxxx
> Cc: hpa@xxxxxxxxx; tglx@xxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
> Subject: RE: [PATCH RFC V2] [x86] Optimize copy-page by reducing impact
> from HW prefetch
>
> New experiments show that, for 4096 bytes, there is no improvement on SNB,
> a 10~15% improvement on Core2, and an 11.6% improvement on 64-bit Atom.
>
> Thanks
> Ling

> -----Original Message-----
> From: Andi Kleen [mailto:andi@xxxxxxxxxxxxxx]
> Sent: Thursday, June 23, 2011 4:06 AM
> To: Ma, Ling
> Cc: mingo@xxxxxxx; hpa@xxxxxxxxx; tglx@xxxxxxxxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH RFC] [x86] Optimize copy-page by reducing impact
> from HW prefetch
>
> ling.ma@xxxxxxxxx writes:
> > impact (DCU prefetcher), and simplify original code.
> > The performance is improved by about 15% on Core2 and 36% on SNB,
> > respectively.
> > (We used our micro-benchmark, and will do further tests according to
> > your requirements.)
>
> This doesn't make a lot of sense because neither Core-2 nor SNB uses the
> code path you patched. They both use the rep ; movs path.
>
> -Andi
>
> --
> ak@xxxxxxxxxxxxxxx -- Speaking for myself only