From: Ross Zwisler
Date: Thu Apr 02 2015 - 16:31:30 EST

On Wed, 2015-02-18 at 16:29 -0800, tip-bot for Ross Zwisler wrote:
> Commit-ID: 3b68983dc66c61da3ab4191b891084a7ab09e3e1
> Gitweb:
> Author: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
> AuthorDate: Tue, 27 Jan 2015 09:53:51 -0700
> Committer: Ingo Molnar <mingo@xxxxxxxxxx>
> CommitDate: Thu, 19 Feb 2015 00:06:38 +0100
> x86: Add support for the clwb instruction
> Add support for the new clwb (cache line write back)
> instruction. This instruction was announced in the document
> "Intel Architecture Instruction Set Extensions Programming
> Reference" with reference number 319433-022.
> The clwb instruction is used to write back the contents of
> dirtied cache lines to memory without evicting the cache lines
> from the processor's cache hierarchy. This should be used in
> favor of clflushopt or clflush in cases where you require the
> cache line to be written to memory but plan to access the data
> again in the near future.
> One of the main use cases for this is with persistent memory
> where clwb can be used with pcommit to ensure that data has been
> accepted to memory and is durable on the DIMM.
> This function shows how to properly use clwb/clflushopt/clflush
> and pcommit with appropriate fencing:
> void flush_and_commit_buffer(void *vaddr, unsigned int size)
> {
>         void *vend = vaddr + size - 1;
>
>         for (; vaddr < vend; vaddr += boot_cpu_data.x86_clflush_size)
>                 clwb(vaddr);
>
>         /* Flush any possible final partial cacheline */
>         clwb(vend);
>
>         /*
>          * sfence to order clwb/clflushopt/clflush cache flushes
>          * mfence via mb() also works
>          */
>         wmb();
>
>         /* pcommit and the required sfence for ordering */
>         pcommit_sfence();
> }
> After this function completes, the data pointed to by vaddr has
> been accepted to memory and will be durable if vaddr points to
> persistent memory.
> Regarding the details of how the alternatives assembly is set
> up, we need one additional byte at the beginning of the clflush
> so that we can flip it into a clflushopt by changing that byte
> into a 0x66 prefix. Two options are to either insert a 1 byte
> ASM_NOP1, or to add a 1 byte NOP_DS_PREFIX. Both have no
> functional effect with the plain clflush, but I've been told
> that executing a clflush + prefix should be faster than
> executing a clflush + NOP.
> We had to hard code the assembly for clwb because, lacking the
> ability to assemble the clwb instruction itself, the next
> closest thing is to have an xsaveopt instruction with a 0x66
> prefix. Unfortunately xsaveopt itself is also relatively new,
> and isn't included by all the GCC versions that the kernel needs
> to support.
> Signed-off-by: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
> Acked-by: Borislav Petkov <bp@xxxxxxx>
> Acked-by: H. Peter Anvin <hpa@xxxxxxxxxxxxxxx>
> Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Link:
> Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>

Ping on this patch - it looks like the pcommit patch is in the tip tree,
but this one is missing?

I'm looking at the tree as of:
9a760fbbdc7 "Merge branch 'tools/kvm'"
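
For anyone following the encoding discussion in the quoted commit message, here is a small userspace sketch of the byte layouts involved. It is not the kernel code from the patch: the helper names (encode_0fae, patch_prefix, modrm_rax) are made up for illustration, and it assumes a plain (%rax) memory operand. It only builds the instruction bytes in a buffer and never executes them, so it does not need a CPU with clflushopt or clwb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Prefix bytes mentioned in the commit message. */
enum {
	PREFIX_DS     = 0x3e,	/* DS segment override, no effect on clflush */
	PREFIX_OPSIZE = 0x66	/* turns clflush into clflushopt, xsaveopt into clwb */
};

/* Opcode extension (the /digit in the ModRM reg field) for the 0F AE group. */
enum {
	DIGIT_XSAVEOPT_CLWB      = 6,	/* 0F AE /6 */
	DIGIT_CLFLUSH_CLFLUSHOPT = 7	/* 0F AE /7 */
};

/* ModRM byte for a (%rax) memory operand: mod=00, reg=digit, rm=000. */
static uint8_t modrm_rax(unsigned int digit)
{
	return (uint8_t)((digit & 7u) << 3);
}

/* Hypothetical helper: emit "prefix 0F AE modrm" into out, return its length. */
static size_t encode_0fae(uint8_t *out, uint8_t prefix, unsigned int digit)
{
	out[0] = prefix;
	out[1] = 0x0f;
	out[2] = 0xae;
	out[3] = modrm_rax(digit);
	return 4;
}

/*
 * Hypothetical helper: flip the one-byte DS placeholder in front of a
 * clflush into the 0x66 prefix, which yields clflushopt. This mimics
 * the effect of the alternatives patching, not its mechanism.
 */
static void patch_prefix(uint8_t *insn)
{
	assert(insn[0] == PREFIX_DS);
	insn[0] = PREFIX_OPSIZE;
}
```

With these helpers, encode_0fae(buf, PREFIX_DS, DIGIT_CLFLUSH_CLFLUSHOPT) gives 3e 0f ae 38 (clflush with the placeholder prefix), patch_prefix(buf) turns that into 66 0f ae 38 (clflushopt), and encode_0fae(buf, PREFIX_OPSIZE, DIGIT_XSAVEOPT_CLWB) gives 66 0f ae 30, which is an xsaveopt encoding with a 0x66 prefix, i.e. clwb.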

