Re: [PATCH v2 5/6] x86/xen: Add a Xen-specific sync_core() implementation

From: Andy Lutomirski
Date: Fri Dec 02 2016 - 12:24:07 EST


On Fri, Dec 2, 2016 at 9:16 AM, Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:
> On 02/12/16 17:07, Andy Lutomirski wrote:
>> On Dec 2, 2016 3:44 AM, "Andrew Cooper" <andrew.cooper3@xxxxxxxxxx> wrote:
>>> On 02/12/16 00:35, Andy Lutomirski wrote:
>>>> On Xen PV, CPUID is likely to trap, and Xen hypercalls aren't
>>>> guaranteed to serialize. (Even CPUID isn't *really* guaranteed to
>>>> serialize on Xen PV, but, in practice, any trap it generates will
>>>> serialize.)
>>> Well, Xen will enabled CPUID Faulting wherever it can, which is
>>> realistically all IvyBridge hardware and newer.
>>>
>>> All hypercalls are a privilege change to cpl0. I'd hope this condition
>>> is serialising, but I can't actually find any documentation proving or
>>> disproving this.
>> I don't know for sure. IRET is serializing, and if Xen returns using
>> IRET, we're fine.
>
> All returns to a 64bit PV guest at defined points (hypercall return,
> exception entry, etc) are from SYSRET, not IRET.

But CPUID faulting isn't like this, right? Unless Xen does
opportunistic SYSRET.

>
> Talking of, I still have a patch to remove
> PARAVIRT_ADJUST_EXCEPTION_FRAME which I need to complete and send upstream.
>
>>
>>>> On my laptop, CPUID(eax=1, ecx=0) is ~83ns and IRET-to-self is
>>>> ~110ns. But Xen PV will trap CPUID if possible, so IRET-to-self
>>>> should end up being a nice speedup.
>>>>
>>>> Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>>>> Signed-off-by: Andy Lutomirski <luto@xxxxxxxxxx>
>>> CC'ing xen-devel and the Xen maintainers in Linux.
>>>
>>> As this is the only email from this series in my inbox, I will say this
>>> here, but it should really be against patch 6.
>>>
>>> A write to %cr2 is apparently (http://sandpile.org/x86/coherent.htm) not
>>> serialising on the 486, but I don't have a manual to hand to check.
>> I'll quote the (modern) SDM. For self-modifying code "The use of one
>> of these options is not required for programs intended to run on the
>> Pentium or Intel486 processors,
>> but are recommended to ensure compatibility with the P6 and more
>> recent processor families.". For cross-modifying code "The use of
>> this option is not required for programs intended to run on the
>> Intel486 processor, but is recommended
>> to ensure compatibility with the Pentium 4, Intel Xeon, P6 family, and
>> Pentium processors." So I'm not sure there's a problem.
>
> Fair enough. (Assuming similar properties hold for the older processors
> of other vendors.)

No, you were right -- a different section of the SDM contradicts it:

For Intel486 processors, a write to an instruction in the cache will
modify it in both the cache and memory, but if
the instruction was prefetched before the write, the old version of
the instruction could be the one executed. To
prevent the old instruction from being executed, flush the instruction
prefetch unit by coding a jump instruction
immediately after any write that modifies an instruction.

--Andy