Re: [PATCH] vsyscall: use __iter_div_u64_rem()
From: Jan Beulich
Date: Mon Jul 22 2019 - 06:39:37 EST
On 22.07.2019 12:10, Thomas Gleixner wrote:
> On Thu, 11 Jul 2019, Arnd Bergmann wrote:
>
> Trimmed CC list and added Jan
>
>> See below for the patch I am using locally to work around this.
>> That patch is probably wrong, so I have not submitted it yet, but it
>> gives you a clean build ;-)
>>
>> Arnd
>> 8<---
>> Subject: [PATCH] x86: percpu: fix clang 32-bit build
>>
>> clang does not like inline assembly with a "=q" constraint for
>> a 64-bit output:
>>
>> arch/x86/events/perf_event.h:824:21: error: invalid output size for
>> constraint '=q'
>> u64 disable_mask = __this_cpu_read(cpu_hw_events.perf_ctr_virt_mask);
>> ^
>> include/linux/percpu-defs.h:447:2: note: expanded from macro '__this_cpu_read'
>> raw_cpu_read(pcp); \
>> ^
>> include/linux/percpu-defs.h:421:28: note: expanded from macro 'raw_cpu_read'
>> #define raw_cpu_read(pcp) __pcpu_size_call_return(raw_cpu_read_, pcp)
>> ^
>> include/linux/percpu-defs.h:322:23: note: expanded from macro
>> '__pcpu_size_call_return'
>> case 1: pscr_ret__ = stem##1(variable); break; \
>> ^
>> <scratch space>:357:1: note: expanded from here
>> raw_cpu_read_1
>> ^
>> arch/x86/include/asm/percpu.h:394:30: note: expanded from macro 'raw_cpu_read_1'
>> #define raw_cpu_read_1(pcp) percpu_from_op(, "mov", pcp)
>> ^
>> arch/x86/include/asm/percpu.h:189:15: note: expanded from macro 'percpu_from_op'
>> : "=q" (pfo_ret__) \
>> ^
>>
>> According to the commit that introduced the "q" constraint, this was
>> needed to fix miscompilation, but it gives no further detail.
>
> Jan, do you have any memory why you added those 'q' constraints? The
> changelog of 3c598766a2ba is not really helpful.

"q" was used in that commit exclusively for byte sized operands, simply
because that _is_ the constraint to use in such cases. Using "r" is
wrong on 32-bit, as it would include inaccessible byte portions of %sp,
%bp, %si, and %di. This is how it's described in gcc sources / docs:
"Any register accessible as @code{@var{r}l}. In 32-bit mode, @code{a},
@code{b}, @code{c}, and @code{d}; in 64-bit mode, any integer register."
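To illustrate the byte-operand case, here is a minimal sketch of a one-byte load done through inline asm with a "=q" output. The helper name load_byte() is hypothetical and is not taken from the kernel; the non-x86 branch is only there so the snippet builds elsewhere.

```c
#include <stdint.h>

/* Hypothetical helper (not a kernel function): load one byte via inline asm. */
static inline uint8_t load_byte(const uint8_t *p)
{
#if defined(__i386__) || defined(__x86_64__)
	uint8_t ret;

	/*
	 * "=q" limits the output to a register whose low byte is
	 * addressable: a, b, c or d in 32-bit mode, any integer
	 * register in 64-bit mode. With "=r" a 32-bit build could be
	 * handed %esi or %edi, which have no byte-sized form there.
	 */
	__asm__("movb %1, %0" : "=q" (ret) : "m" (*p));
	return ret;
#else
	return *p;	/* plain C fallback on non-x86 targets */
#endif
}
```

Calling load_byte(&v) on a uint8_t v simply returns the value of v; the point is only which registers the compiler may pick for the output operand.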
What I'm struggling with is why clang would evaluate that asm() in the
first place when a 64-bit field (perf_ctr_virt_mask) is being accessed.
Jan