RE: [PATCH v2] x86, pmem: fix broken __copy_user_nocache cache-bypass assumptions

From: Kani, Toshimitsu
Date: Mon Apr 10 2017 - 17:45:58 EST


> On Mon, Apr 10, 2017 at 2:28 PM, Kani, Toshimitsu <toshi.kani@xxxxxxx>
> wrote:
> >> > Thanks for the update. I think the alignment check should be based on
> >> > the following note in copy_user_nocache.
> >> >
> >> > * Note: Cached memory copy is used when destination or size is not
> >> > * naturally aligned. That is:
> >> > * - Require 8-byte alignment when size is 8 bytes or larger.
> >> > * - Require 4-byte alignment when size is 4 bytes.
> >> >
> >> > So, I think the code may be something like this. I also made the following
> >> changes:
> >>
> >> Thanks!
> >>
> >> > - Mask with 7, not 8.
> >>
> >> Yes, good catch.
> >>
> >> > - ALIGN with cacheline size, instead of 8.
> >> > - Add (bytes > flushed) test since calculation with unsigned long still
> results
> >> in a negative
> >> > value (as a positive value).
> >> >
> >> > if (bytes < 8) {
> >> > if ((dest & 3) || (bytes != 4))
> >> > arch_wb_cache_pmem(addr, 1);
> >> > } else {
> >> > if (dest & 7) {
> >> > dest = ALIGN(dest, boot_cpu_data.x86_clflush_size);
> >>
> >> Why align the destination to the next cacheline? As far as I can see
> >> the ALIGN_DESTINATION macro in arch/x86/include/asm/asm.h only aligns
> >> to the next 8-byte boundary.
> >
> > The clflush here flushes for the cacheline size. So, we do not need to flush
> > the same cacheline again when the unaligned tail is in the same line.
>
> Ok, makes sense. Last question, can't we reduce the check to be:
>
> if ((bytes > flushed) && ((bytes - flushed) & 3))
>
> ...since if 'bytes' was 4-byte aligned we would have performed
> non-temporal stores.

That is not documented behavior of copy_user_nocache, but as long as the pmem
version of copy_user_nocache follows the same implemented behavior, yes, that
works.

> Can I add your Signed-off-by: on v3?

Sure.

Thanks,
-Toshi