Re: [PATCH v4 01/17] x86/common: Align cpu_caps_cleared and cpu_caps_set to unsigned long

From: Paolo Bonzini
Date: Mon Mar 04 2019 - 03:40:01 EST


On 02/03/19 03:44, Fenghua Yu wrote:
> cpu_caps_cleared and cpu_caps_set may not be aligned to unsigned long.
> Atomic operations (i.e. set_bit and clear_bit) on the bitmaps may access
> two cache lines (a.k.a. split lock) and lock bus to block all memory
> accesses from other processors to ensure atomicity.
>
> To avoid the overall performance degradation from the bus locking, align
> the two variables to unsigned long.
>
> Defining the variables as unsigned long may also fix the issue because
> they are naturally aligned to unsigned long. But that needs additional
> code changes. Adding __aligned(unsigned long) are simpler fixes.
>
> Signed-off-by: Fenghua Yu <fenghua.yu@xxxxxxxxx>
> ---
> arch/x86/kernel/cpu/common.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index cb28e98a0659..51ab37ba5f64 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -488,8 +488,9 @@ static const char *table_lookup_model(struct cpuinfo_x86 *c)
> return NULL; /* Not found */
> }
>
> -__u32 cpu_caps_cleared[NCAPINTS + NBUGINTS];
> -__u32 cpu_caps_set[NCAPINTS + NBUGINTS];
> +/* Unsigned long alignment to avoid split lock in atomic bitmap ops */
> +__u32 cpu_caps_cleared[NCAPINTS + NBUGINTS] __aligned(sizeof(unsigned long));
> +__u32 cpu_caps_set[NCAPINTS + NBUGINTS] __aligned(sizeof(unsigned long));
>
> void load_percpu_segment(int cpu)
> {
>

(resending including the list)

Why not instead change set_bit/clear_bit to use btsl/btrl instead of
btsq/btrq?

Thanks,

Paolo