Re: [PATCH] x86/vdso: Use non-serializing instruction rdtsc
From: Thomas Gleixner
Date: Tue May 16 2023 - 16:40:05 EST
On Tue, May 16 2023 at 10:57, H. Peter Anvin wrote:
> On May 16, 2023 7:12:34 AM PDT, Dave Hansen <dave.hansen@xxxxxxxxx> wrote:
>>On 5/15/23 23:52, Rong Tao wrote:
>>> Replacing rdtscp or 'lfence;rdtsc' with the non-serializable instruction
>>> rdtsc can achieve a 40% performance improvement with only a small loss of
>>> precision.
>>
>>I think the minimum that can be done in a changelog like this is to
>>figure out _why_ a RDTSCP was in use. There are a ton of things that
>>can make the kernel go faster, but not all of them are a good idea.
>>
>>I assume that the folks that wrote this had good reason for not using
>>plain RDTSC. What were those reasons?
>
> I believe the motivation is that it is atomic with reading the CPU number.
Belief belongs in the realm of religion and does not help much to
explain technical issues. :)
rdtsc_ordered() actually has useful comments. Also see:
https://lore.kernel.org/lkml/87ttwc73za.ffs@tglx
The Intel SDM and the AMD APM are both blurry about RDTSC speculation and
we've observed (quite some time ago) situations where the RDTSC value
was clearly from the past solely due to speculation. So we had to bite
the bullet and add the fencing: preferably RDTSCP or, if that is not
available, LFENCE; RDTSC. IIRC the original variant was even CPUID; RDTSC,
which is daft.
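For illustration only, the two fenced variants look roughly like this in
userspace, using the GCC/Clang intrinsics. This is a sketch, not the kernel
helper: tsc_ordered_sketch() and have_rdtscp() are made-up names, and the
in-kernel rdtsc_ordered() picks its variant by patching the instruction
rather than by branching at run time.

```c
#if defined(__x86_64__)		/* sketch is x86-64 only */
#include <cpuid.h>
#include <stdint.h>
#include <x86intrin.h>

/* Hypothetical helper: CPUID.80000001H:EDX[27] reports RDTSCP support. */
int have_rdtscp(void)
{
	static int cached = -1;	/* -1: not probed yet */
	unsigned int eax, ebx, ecx, edx;

	if (cached < 0) {
		cached = 0;
		if (__get_cpuid(0x80000001u, &eax, &ebx, &ecx, &edx))
			cached = (edx >> 27) & 1;
	}
	return cached;
}

/* Hypothetical helper: a TSC read that is not executed ahead of earlier loads. */
uint64_t tsc_ordered_sketch(void)
{
	if (have_rdtscp()) {
		unsigned int aux;	/* receives IA32_TSC_AUX, unused here */

		/* RDTSCP waits for all previous instructions to execute. */
		return __rdtscp(&aux);
	}

	/* Fallback: LFENCE keeps the following RDTSC from running early. */
	_mm_lfence();
	return __rdtsc();
}
#endif
```

Plain __rdtsc() without the LFENCE is the variant the patch proposes, and it
is the one that carries no such ordering.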
The time readout does (simplified):

	do {
		// Wait for the sequence count to become even
		while ((seq = READ_ONCE(vd->seq)) & 1);

		tsc = rdtsc_ordered();
		now = convert(vd, tsc);
	} while (seq != READ_ONCE(vd->seq));

It's obviously more complex than that, but you get the idea.
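To make the ordering requirement concrete, here is a minimal userspace
sketch of that loop in C11. Everything in it is invented for illustration
(struct vd_sketch, read_ns_sketch(), the mult/shift conversion and the
fixed_cycles() stand-in counter); it is not the vDSO code.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the vDSO data page; field names are made up. */
struct vd_sketch {
	_Atomic uint32_t seq;		/* odd while the writer is updating */
	_Atomic uint64_t base_ns;
	_Atomic uint64_t base_cycles;
	_Atomic uint32_t mult;
	_Atomic uint32_t shift;
};

#define LD(p)	atomic_load_explicit((p), memory_order_relaxed)

/* Stand-in counter so the loop can be tried without touching the TSC. */
uint64_t fixed_cycles(void)
{
	return 150;
}

uint64_t read_ns_sketch(struct vd_sketch *vd, uint64_t (*read_cycles)(void))
{
	uint32_t seq;
	uint64_t cycles, delta, now;

	do {
		/* Wait for the sequence count to become even. */
		while ((seq = atomic_load_explicit(&vd->seq,
						   memory_order_acquire)) & 1)
			;

		/*
		 * The counter read is not a memory access, so the acquire
		 * load above does not order it. read_cycles() has to bring
		 * its own ordering, which is the point of the fencing.
		 */
		cycles = read_cycles();

		delta = cycles - LD(&vd->base_cycles);
		now = LD(&vd->base_ns) + ((delta * LD(&vd->mult)) >> LD(&vd->shift));

		/* Keep the data loads ahead of the sequence re-check. */
		atomic_thread_fence(memory_order_acquire);
	} while (seq != LD(&vd->seq));

	return now;
}
```

If read_cycles() is a plain RDTSC, nothing in this loop stops the counter
value from being sampled before the first sequence load, so a stale cycle
count can be paired with freshly updated base values.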
Now replace RDTSCP with RDTSC and explain what guarantees that
the TSC read isn't speculated ahead of the sequence check.
If it's architecturally guaranteed that this can't happen, I'm more than
happy to use plain RDTSC.
But as I've observed that myself in the past, I'm pretty sure that it is
not guaranteed, at least not on older microarchitectures. If newer ones
make that guarantee then they should have exposed that as a feature bit
in CPUID and clearly documented it in the SDM.
As long as that does not happen, I'm sticking to the correctness-first
principle.
Thanks,
tglx