Re: [PATCH v10 6/6] x86/split_lock: Enable split lock detection by kernel parameter

From: Fenghua Yu
Date: Thu Nov 21 2019 - 15:13:26 EST


On Thu, Nov 21, 2019 at 11:01:39AM -0800, Andy Lutomirski wrote:
>
> > On Nov 21, 2019, at 10:40 AM, Fenghua Yu <fenghua.yu@xxxxxxxxx> wrote:
> >
> > On Thu, Nov 21, 2019 at 09:51:03AM -0800, Andy Lutomirski wrote:
> >>> On Thu, Nov 21, 2019 at 9:43 AM David Laight <David.Laight@xxxxxxxxxx> wrote:
> >>>
> >>> From: Ingo Molnar
> >>>> Sent: 21 November 2019 17:12
> >>>> * Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >>> ...
> >>>>> This feature MUST be default enabled, otherwise everything will
> >>>>> be/remain broken and we'll end up in the situation where you can't use
> >>>>> it even if you wanted to.
> >>>>
> >>>> Agreed.
> >>>
> >>> Before it can be enabled by default someone needs to go through the
> >>> kernel and fix all the code that abuses the 'bit' functions by using them
> >>> on int[] instead of long[].
> >>>
> >>> I've only seen one fix go through for one use case of one piece of code
> >>> that repeatedly uses potentially misaligned int[] arrays for bitmasks.
> >>>
> >>
> >> Can we really not just change the lock asm to use 32-bit accesses for
> >> set_bit(), etc? Sure, it will fail if the bit index is greater than
> >> 2^32, but that seems nuts.
> >>
> >> (Why the *hell* do the bitops use long anyway? They're *bit masks*
> >> for crying out loud. As in, users generally want to operate on fixed
> >> numbers of bits.)
> >
> > We are working on a separate patch set to fix all split lock issues
> > in atomic bitops. Per Peter Anvin and Tony Luck suggestions:
> > 1. Still keep the byte optimization if nr is constant. No split lock.
> > 2. If the type of *addr is unsigned long, do a quadword atomic instruction
> > on addr. No split lock.
> > 3. If the type of *addr is unsigned int, do a word atomic instruction
> > on addr. No split lock.
> > 4. Otherwise, re-calculate addr to point to the 32-bit address which
> > contains the bit and operate on the bit. No split lock.
> >
> > Only a small percentage of atomic bitops calls fall into case 4 (e.g. 3%
> > for set_bit()); these need a few extra instructions to re-calculate the
> > address but avoid the big split lock overhead.
> >
> > To get the real type of *addr instead of the cast type "unsigned long",
> > the atomic bitops APIs are changed from functions to macros. This change
> > needs to touch all architectures.
> >
>
> Isn't the kernel full of casts to long* to match the signature? Doing this
> based on type seems silly to me. I think it's better to just do a 32-bit
> operation unconditionally and to try to optimize it using b*l when safe.

Actually we found only 8 places that call the atomic bitops with a cast to
"unsigned long *". After the above changes, another 8 patches remove those
casts, and then the atomic bitops in the current kernel are split lock free.
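
To illustrate, here is a minimal userspace sketch of the type dispatch for
cases 2-4 above (helper names like my_set_bit() are made up for illustration
and are not the actual patch set; GCC's __atomic builtins stand in for the
kernel's lock-prefixed asm, and case 1, the byte optimization for constant
nr, is omitted):

#include <stdint.h>
#include <stdio.h>

/* Case 3: a 4-byte atomic OR ('lock orl' on x86).  A 4-byte access on
 * a 4-byte-aligned address can never cross a cache line: no split lock. */
static inline void set_bit32(unsigned int nr, volatile uint32_t *addr)
{
        __atomic_fetch_or(addr + nr / 32, 1u << (nr % 32), __ATOMIC_SEQ_CST);
}

/* Case 2: an 8-byte atomic OR for naturally aligned unsigned long
 * bitmaps (assumes 64-bit unsigned long, as on x86-64). */
static inline void set_bit64(unsigned int nr, volatile uint64_t *addr)
{
        __atomic_fetch_or(addr + nr / 64, 1ull << (nr % 64), __ATOMIC_SEQ_CST);
}

/* Case 4: re-calculate a 4-byte-aligned address that contains the bit
 * and do the 4-byte op there (little-endian bit numbering, as on x86).
 * Assumes the underlying object is at least 4-byte aligned, so the
 * re-calculated word stays inside the object. */
static inline void set_bit_fixup(unsigned int nr, volatile void *addr)
{
        uintptr_t byte = (uintptr_t)addr + nr / 8;
        volatile uint32_t *word = (volatile uint32_t *)(byte & ~(uintptr_t)3);
        unsigned int bit = (nr % 8) + 8 * (byte & 3);

        __atomic_fetch_or(word, 1u << bit, __ATOMIC_SEQ_CST);
}

/* Being a macro, this sees the real type of *addr and picks the widest
 * operation that is guaranteed to stay inside one cache line. */
#define my_set_bit(nr, addr)                                                \
        __builtin_choose_expr(                                              \
            __builtin_types_compatible_p(typeof(*(addr)), unsigned long),   \
            set_bit64((nr), (volatile uint64_t *)(addr)),                   \
        __builtin_choose_expr(                                              \
            __builtin_types_compatible_p(typeof(*(addr)), unsigned int),    \
            set_bit32((nr), (volatile uint32_t *)(addr)),                   \
            set_bit_fixup((nr), (addr))))

int main(void)
{
        unsigned long l[1] = { 0 };
        unsigned int w[2] = { 0, 0 };
        unsigned char b[8] __attribute__((aligned(4))) = { 0 };

        my_set_bit(3, l);       /* case 2: 8-byte op */
        my_set_bit(35, w);      /* case 3: 4-byte op, lands in w[1] */
        my_set_bit(13, b);      /* case 4: re-calculated 4-byte op */
        printf("%lx %x %x\n", l[0], w[1], b[1]);
        return 0;
}

Because my_set_bit() is a macro, typeof(*(addr)) sees the caller's real
pointee type, which a function prototyped with "unsigned long *" never can.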

To catch casts in new patches, we add a check to checkpatch.pl that warns
on any type cast in atomic bitops calls, because the APIs are macros and
gcc doesn't warn or issue an error on the cast.
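
For example, a (made-up) call like this is what the rule would flag; the
cast hides the real type from the macro and gcc accepts it silently:

        unsigned int flags[2];

        /* cast defeats the type dispatch: an 8-byte op on a possibly
         * only 4-byte-aligned object can still split a cache line */
        set_bit(0, (unsigned long *)flags);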

Using b*l would change those 8 places as well, plus a lot of other places
where *addr is defined as "unsigned long *", right?
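
For comparison, a sketch of that unconditional 32-bit variant (continuing
the sketch above; the name is again made up):

/* Always operate on the aligned 4-byte word containing the bit,
 * whatever the pointee type.  Assumes addr is at least 4-byte aligned,
 * which holds for both unsigned long and unsigned int bitmaps. */
static inline void set_bit_32only(unsigned int nr, volatile void *addr)
{
        volatile uint32_t *p = (volatile uint32_t *)addr + nr / 32;

        __atomic_fetch_or(p, 1u << (nr % 32), __ATOMIC_SEQ_CST);
}

On little-endian the 4-byte view of an unsigned long bitmap keeps the same
bit layout, so this would indeed reach all of those "unsigned long *" call
sites as well, just by changing the access width.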

Thanks.

-Fenghua