Re: [PATCH] x86/vdso: Use non-serializing instruction rdtsc

From: Andy Lutomirski
Date: Tue May 16 2023 - 17:55:15 EST


On Mon, May 15, 2023, at 11:52 PM, Rong Tao wrote:
> From: Rong Tao <rongtao@xxxxxxxx>
>
> Replacing rdtscp or 'lfence;rdtsc' with the non-serializing rdtsc
> instruction yields a performance improvement of roughly 40%, at the cost
> of a small loss of precision.
>
> The RDTSCP instruction is not a serializing instruction, but it does wait
> until all previous instructions have executed and all previous loads are
> globally visible. The RDTSC instruction is not a serializing instruction.
> It does not necessarily wait until all previous instructions have been
> executed before reading the counter.
>
> Measure the time taken by the vDSO clock_gettime(), pseudocode:
>
>     count = 1000 * 1000 * 100;
>     while (count--)
>         clock_gettime(CLOCK_REALTIME, &ts);
>
> Elapsed time for the loop above, in ns:
>
>                   | rdtsc_ordered() | rdtsc()   | Improvement
> ------------------+-----------------+-----------+------------
> Physical machine  | 1269147289      | 759067324 | 40%
> Guest OS (KVM)    | 1756615963      | 995823886 | 43%
>
> Signed-off-by: Rong Tao <rongtao@xxxxxxxx>

Out of curiosity, what happens if you apply that patch and run this thing:

https://git.kernel.org/pub/scm/linux/kernel/git/luto/misc-tests.git/tree/evil-clock-test.cc

Build it with g++ -O2 and run:

./evil-clock-test -c monotonic

--Andy