Re: [PATCH v3 1/2] x86/vdso: Move mult and shift into struct vgtod_ts
From: Sverdlin, Alexander (Nokia - DE/Ulm)
Date: Thu Jun 27 2019 - 08:15:25 EST
Hi!
On 27/06/2019 14:07, Thomas Gleixner wrote:
>>>>> I'm in the process of merging that series and I actually adapted your
>>>>> scheme to the new unified infrastructure where it has exactly the same
>>>>> effects as with your original patches against the x86 version.
>>>> please let me know if I need to rework [2/2] based on some not-yet-published
>>>> branch of yours.
>>> I've pushed it out now to
>>>
>>> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git timers/vdso
>>>
>>> The generic VDSO library has the support for RAW already with that separate
>>> array. Testing would be welcomed!
>>
>> Thanks for your and Vincenzo's efforts!
>> I've applied the series onto 5.2.0-rc6 and did a quick test on a bare-metal
>> x86_64 machine, and it looks good to me:
>
> Did you use the git tree? If not, it would be interesting to have a test
> against that as well because that's the final version.
I've applied the following list:
32e29396f00e hrtimer: Split out hrtimer defines into separate header
361f8aee9b09 vdso: Define standardized vdso_datapage
00b26474c2f1 lib/vdso: Provide generic VDSO implementation
629fdf77ac45 lib/vdso: Add compat support
44f57d788e7d timekeeping: Provide a generic update_vsyscall() implementation
28b1a824a4f4 arm64: vdso: Substitute gettimeofday() with C implementation
98cd3c3f83fb arm64: vdso: Build vDSO with -ffixed-x18
53c489e1dfeb arm64: compat: Add missing syscall numbers
206c0dfa3c55 arm64: compat: Expose signal related structures
f14d8025d263 arm64: compat: Generate asm offsets for signals
a7f71a2c8903 arm64: compat: Add vDSO
c7aa2d71020d arm64: vdso: Refactor vDSO code
7c1deeeb0130 arm64: compat: VDSO setup for compat layer
1e3f17f55aec arm64: elf: VDSO code page discovery
f01703b3d2e6 arm64: compat: Get sigreturn trampolines from vDSO
bfe801ebe84f arm64: vdso: Enable vDSO compat support
7ac870747988 x86/vdso: Switch to generic vDSO implementation
f66501dc53e7 x86/vdso: Add clock_getres() entry point
22ca962288c0 (tip/WIP.vdso) x86/vdso: Add clock_gettime64() entry point
ecf9db3d1f1a x86/vdso: Give the [ph]vclock_page declarations real types
ed75e8f60bb1 vdso: Remove superfluous #ifdef __KERNEL__ in vdso/datapage.h
94fee4d43752 arm64: vdso: Remove unnecessary asm-offsets.c definitions
6a5b78b32d10 arm64: compat: No need for pre-ARMv7 barriers on an ARMv8 system
e70980312a94 MAINTAINERS: Add entry for the generic VDSO library
9d90b93bf325 lib/vdso: Make delta calculation work correctly
27e11a9fe2e2 arm64: Fix __arch_get_hw_counter() implementation
6241c4dc6ec5 arm64: compat: Fix __arch_get_hw_counter() implementation
3acf4be23528 (tip/timers/vdso) arm64: vdso: Fix compilation with clang older than 8
If you expect a difference, I can re-test using your tree as-is.
>> Number of clock_gettime() calls in 10 seconds:
>>
>>                   Before       After    Diff
>> MONOTONIC      152404300   200825950    +32%
>> MONOTONIC_RAW   38804788   198765053   +412%
>> REALTIME       151672619   201371468    +33%
>
> The increase for mono and realtime is impressive. Which CPU is that?
This time it was:
processor : 3
vendor_id : AuthenticAMD
cpu family : 21
model : 96
model name : AMD PRO A10-8700B R6, 10 Compute Cores 4C+6G
stepping : 1
microcode : 0x600611a
cpu MHz : 2622.775
cache size : 1024 KB
physical id : 0
siblings : 4
core id : 3
cpu cores : 2
apicid : 19
initial apicid : 3
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good acc_power nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb bpext ptsc mwaitx cpb hw_pstate ssbd ibpb vmmcall fsgsbase bmi1 avx2 smep bmi2 xsaveopt arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic vgif overflow_recov
bugs : fxsave_leak sysret_ss_attrs null_seg spectre_v1 spectre_v2 spec_store_bypass
bogomips : 3594.00
TLB size : 1536 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management: ts ttp tm 100mhzsteps hwpstate cpb eff_freq_ro acc_power [13]
(It's a different machine from the one I used for my patches.)
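
For anyone who wants to repeat the comparison, a measurement of the kind quoted above (number of clock_gettime() calls in a fixed interval) can be sketched roughly as below. This is only a minimal sketch and not the test program that produced the numbers in this thread. The helper names (ts_to_ns, count_calls, percent_diff, run_benchmark) are made up for illustration, and it assumes a Linux system where CLOCK_MONOTONIC_RAW is available.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Convert a timespec to nanoseconds. */
static int64_t ts_to_ns(const struct timespec *ts)
{
	return (int64_t)ts->tv_sec * 1000000000LL + ts->tv_nsec;
}

/*
 * Count clock_gettime() calls on @clk until @duration_ns has elapsed
 * on that same clock. Returns 0 on error. Note that CLOCK_REALTIME
 * may be stepped while the loop runs, which would skew the interval.
 */
static uint64_t count_calls(clockid_t clk, int64_t duration_ns)
{
	struct timespec ts;
	uint64_t calls = 0;
	int64_t start;

	if (clock_gettime(clk, &ts) != 0)
		return 0;
	start = ts_to_ns(&ts);

	do {
		if (clock_gettime(clk, &ts) != 0)
			return 0;
		calls++;
	} while (ts_to_ns(&ts) - start < duration_ns);

	return calls;
}

/* Relative change in percent, as in the "Diff" column. */
static double percent_diff(uint64_t before, uint64_t after)
{
	assert(before != 0);
	return ((double)after - (double)before) * 100.0 / (double)before;
}

/* Print one line per clock with the number of calls in @duration_ns. */
static void run_benchmark(int64_t duration_ns)
{
	static const struct {
		const char *name;
		clockid_t id;
	} clocks[] = {
		{ "MONOTONIC",     CLOCK_MONOTONIC },
		{ "MONOTONIC_RAW", CLOCK_MONOTONIC_RAW },
		{ "REALTIME",      CLOCK_REALTIME },
	};
	size_t i;

	for (i = 0; i < sizeof(clocks) / sizeof(clocks[0]); i++)
		printf("%-14s %llu\n", clocks[i].name,
		       (unsigned long long)count_calls(clocks[i].id, duration_ns));
}
```

Calling run_benchmark(10 * 1000000000LL) from main() on the old and the new kernel gives the "Before" and "After" columns, and percent_diff() on each pair gives the "Diff" column. The loop uses the timestamp it just read to decide when to stop, so no extra clock reads distort the count.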
--
Best regards,
Alexander Sverdlin.