Re: [PATCH 1/5] x86/percpu: Differentiate this_cpu_{}() and __this_cpu_{}()

From: Nadav Amit
Date: Wed Feb 27 2019 - 12:57:38 EST


> On Feb 27, 2019, at 8:14 AM, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Wed, Feb 27, 2019 at 2:16 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>> Nadav Amit reported that commit:
>>
>> b59167ac7baf ("x86/percpu: Fix this_cpu_read()")
>>
>> added a bunch of constraints to all sorts of code; and while some of
>> that was correct and desired, some of that seems superfluous.
>
> Trivial (but entirely untested) patch attached.
>
> That said, I didn't actually check how it affects code generation.
> Nadav, would you check the code sequences you originally noticed?

The original issue came up while I was looking into a dropped patch from
Matthew Wilcox that caused a code size increase [1]. As a result I noticed
that Peter's patch caused big changes to the generated assembly across the
kernel - I did not have a specific scenario that I cared about.

The patch you sent ("+m/-volatile") does increase the code size by 1728
bytes. Although code size is not the only metric for "code optimization",
the original patch of Peter ("volatile") only increased the code size by 201
bytes. Peter's original change also affected only 72 functions vs. 228 that
are impacted by the new patch.

I'll have a look at some specific function assembly, but overall, the "+m"
approach might prevent even more code optimizations than the "volatile" one.

I'll send an example or two later.

Regards,
Nadav


[1] https://marc.info/?l=linux-mm&m=154341370216693&w=2