Re: [PATCHv4 26/28] x86/vdso: Align VDSO functions by CPU L1 cache line

From: Thomas Gleixner
Date: Fri Jun 14 2019 - 10:18:40 EST


On Wed, 12 Jun 2019, Dmitry Safonov wrote:

> From: Andrei Vagin <avagin@xxxxxxxxx>
>
> After performance testing the VDSO patches, a noticeable 20% regression
> was found in the gettime_perf selftest with a cold cache.
> As it turns out, before the introduction of time namespaces the VDSO
> functions happened to be well aligned to cache lines, but the new code
> that adjusts the timens offset inside a namespace created a small shift,
> and the vdso functions became unaligned to cache lines.
>
> Align the vdso functions using a gcc option to fix the performance drop.
>
> Copying the resulting numbers from the cover letter:
>
> Hot CPU cache (more gettime_perf.c cycles is better):
>         | before     | CONFIG_TIME_NS=n | host        | inside timens
> --------|------------|------------------|-------------|-------------
> cycles  | 139887013  | 139453003        | 139899785   | 128792458
> diff (%)| 100        | 99.7             | 100         | 92

Why is CONFIG_TIME_NS=n behaving worse than current mainline and
worse than 'host' mode?

> Cold cache (lower tsc per gettime_perf_cold.c cycle is better):
>         | before     | CONFIG_TIME_NS=n | host        | inside timens
> --------|------------|------------------|-------------|-------------
> tsc     | 6748       | 6718             | 6862        | 12682
> diff (%)| 100        | 99.6             | 101.7       | 188

Weird, now CONFIG_TIME_NS=n is better than current mainline and 'host' mode
drops.

Either I'm misreading the numbers or missing something or I'm just confused
as usual :)

Thanks,

tglx