RE: [PATCH RFC] [x86] Optimize copy-page by reducing impact from HW prefetch

From: Ma, Ling
Date: Thu Jun 23 2011 - 22:09:55 EST


Yes, the cleanup patch will come first.

> -----Original Message-----
> From: Ma, Ling
> Sent: Friday, June 24, 2011 10:01 AM
> To: 'Ingo Molnar'; Andi Kleen
> Cc: hpa@xxxxxxxxx; tglx@xxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
> Subject: RE: [PATCH RFC] [x86] Optimize copy-page by reducing impact
> from HW prefetch
>
> Sure, I will split this into two patches ASAP: one for the performance
> tuning code, after some experiments, and another for code style.
>
> Thanks
> Ling
>
> > -----Original Message-----
> > From: Ingo Molnar [mailto:mingo@xxxxxxx]
> > Sent: Thursday, June 23, 2011 3:05 PM
> > To: Andi Kleen
> > Cc: Ma, Ling; hpa@xxxxxxxxx; tglx@xxxxxxxxxxxxx;
> > linux-kernel@xxxxxxxxxxxxxxx
> > Subject: Re: [PATCH RFC] [x86] Optimize copy-page by reducing impact
> > from HW prefetch
> >
> >
> > * Andi Kleen <andi@xxxxxxxxxxxxxx> wrote:
> >
> > > ling.ma@xxxxxxxxx writes:
> > >
> > > > impact (DCU prefetcher), and simplify the original code.
> > > > Performance improves by about 15% on core2 and 36% on snb.
> > > > (We used our own micro-benchmark and will do further testing
> > > > according to your requirements.)
> > >
> > > This doesn't make a lot of sense because neither Core-2 nor SNB uses
> > > the code path you patched. They both use the rep ; movs path.
> >
> > Ling, mind double checking which one is the faster/better one on SNB,
> > in cold-cache and hot-cache situations, copy_page or copy_page_c?
> >
> > Also, while looking at this file please fix the countless pieces of
> > style excrements it has before modifying it:
> >
> > - non-Linux comment style (and two needlessly separate comments - they
> > can be in one comment block):
> >
> > /* Don't use streaming store because it's better when the target
> > ends up in cache. */
> >
> > /* Could vary the prefetch distance based on SMP/UP */
> >
> > - (there are other non-standard comment blocks in this file as well)
> >
> > - The copy_page/copy_page_c naming is needlessly obfuscated, it
> > should be copy_page, copy_page_norep or so - the _c postfix has no
> > obvious meaning.
> >
> > - all #include's should be at the top
> >
> > - please standardize it on the 'instrn %x, %y' pattern that we
> > generally use in arch/x86/, not the 'instrn %x,%y' pattern.
> >
> > and do this cleanup patch first and the speedup on top of it, and
> > keep the two in two separate patches so that the modification to the
> > assembly code can be reviewed more easily.
> >
> > Thanks,
> >
> > Ingo