RE: [PATCH v7 15/21] x86/split_lock: Add a sysfs interface to enable/disable split lock detection during run time

From: David Laight
Date: Wed Apr 24 2019 - 09:44:31 EST


From: Fenghua Yu
> Sent: 23 April 2019 21:48
>
> On Thu, Apr 18, 2019 at 08:41:30AM +0200, Thomas Gleixner wrote:
> > On Wed, 17 Apr 2019, Fenghua Yu wrote:
> > > On Thu, Apr 18, 2019 at 12:47:24AM +0200, Thomas Gleixner wrote:
> > > > On Wed, 17 Apr 2019, Fenghua Yu wrote:
> > > >
> > > > > > The interface /sys/devices/system/cpu/split_lock_detect is added
> > > > > > to allow the user to control split lock detection and show the current
> > > > > > split lock detection setting.
> > > > >
> > > > > Writing [yY1] or [oO][nN] to the file enables split lock detection and
> > > > > writing [nN0] or [oO][fF] disables split lock detection. Split lock
> > > > > detection is enabled or disabled on all CPUs.
> > > > >
> > > > > Reading the file returns current global split lock detection setting:
> > > > > 0: disabled
> > > > > 1: enabled
> > > >
> > > > Again, you explain WHAT this patch does and still there is zero
> > > > justification why this sysfs knob is needed at all. I still do not see any
> > > > reason why this knob should exist.
> > >
> > > An important application has split lock issues which are already discovered
> > > and need to be fixed. But before the issues are fixed, sysadmin still wants to
> > > run the application without rebooting the system, the sysfs knob can be useful
> > > to turn off split lock detection. After the application is done, split lock
> > > detection will be enabled again through the sysfs knob.
> >
> > Are you sure that you are talking about the real world? I might buy the
> > 'off' part somehow, but the 'on' part is beyond theoretical.
> >
> > Even the 'off' part is dubious on a multi user machine. I personally would
> > neither think about using the sysfs knob nor about rebooting the machine
> > simply because I'd consider a lock operation across a cacheline a malicious
> > DoS attempt. Why would I allow that?
> >
> > So in reality the sysadmin will either move the workload to a machine w/o
> > the #AC magic or just tell the user to fix his crap.
> >
> > > Without the sysfs knob, sysadmin has to reboot the system with kernel option
> > > "no_split_lock_detect" to run the application before the split lock issues
> > > are fixed.
> > >
> > > Is this a valid justification why the sysfs knob is needed? If it is, I can
> > > add the justification in the next version.
> >
> > Why has this information not been in the changelog right away? I'm really
> > tired of asking the same questions and pointing you to
> > Documentation/process over and over.
>
> So should I remove the sysfs knob patches in the next version?
>
> Or add the following justification and still keep the sysfs knob patches?
> "To work around or debug a split lock issue, the administrator may need to
> disable or enable split lock detection during run time without rebooting
> the system."

I've also not seen patches to fix all the places where 'lock bit' operations
get used on int [] data.
Testing showed one structure that needed 'fixing'; there are some others
that are in .bss/.data.
A kernel build could suddenly leave them misaligned and crossing a cache line.

All the places that cast the pointer to the bit ops are suspect.
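
To make the concern concrete, here is a minimal sketch of the arithmetic,
not kernel code. It assumes 64-byte cache lines and that the bit op performs
a locked access of sizeof(unsigned long) bytes on the word holding the bit.
CACHE_LINE_SIZE, crosses_cache_line() and bitop_word_addr() are hypothetical
names made up for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines. */
#define CACHE_LINE_SIZE 64u

/* True if an access of `size` bytes starting at `addr` straddles a line. */
static bool crosses_cache_line(uintptr_t addr, size_t size)
{
	return (addr / CACHE_LINE_SIZE) !=
	       ((addr + size - 1) / CACHE_LINE_SIZE);
}

/*
 * Hypothetical helper: address of the long-sized word that would be
 * locked for bit `nr` when `base` has been cast to unsigned long *.
 */
static uintptr_t bitop_word_addr(uintptr_t base, unsigned int nr)
{
	size_t bits_per_long = 8 * sizeof(unsigned long);

	return base + (nr / bits_per_long) * sizeof(unsigned long);
}
```

With 8-byte longs, an int [] that the linker places at offset 60 within a
line is correctly aligned as an int (a 4-byte access there stays inside the
line), but an 8-byte locked access at the same address covers bytes 60..67
and so crosses into the next line.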

David
