> On Sat, 18 Jun 2016 22:02:21 +0200 Manfred Spraul <manfred@xxxxxxxxxxxxxxxx> wrote:
>
> > Commit 6d07b68ce16a ("ipc/sem.c: optimize sem_lock()") introduced a race:
> >
> > sem_lock has a fast path that allows parallel simple operations.
> > There are two reasons why a simple operation cannot run in parallel:
> > - a non-simple operation is ongoing (sma->sem_perm.lock held)
> > - a complex operation is sleeping (sma->complex_count != 0)
> >
> > As both facts are stored independently, a thread can bypass the current
> > checks by sleeping in the right positions. See below for more details
> > (or kernel bugzilla 105651).
> >
> > The patch fixes that by creating one variable (complex_mode)
> > that tracks both reasons why parallel operations are not possible.
> >
> > The patch also updates stale documentation regarding the locking.
> >
> > With regards to stable kernels:
> > The patch is required for all kernels that include the commit 6d07b68ce16a
> > ("ipc/sem.c: optimize sem_lock()") (3.10?)
>
> I've had this in -mm (and -next) since January 4, without issues. I
> put it on hold because Davidlohr expressed concern about performance
> regressions.

I had several ideas how to fix it. The initial ideas probably had performance issues.

> Your [2/2] should prevent those regressions (yes?) so I assume that any
> kernel which has [1/2] really should have [2/2] as well. But without
> any quantitative information, this is all mad guesswork.
>
> What to do?

[2/2] is an improvement, it handles one case better than the current code.
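
For readers following the thread, the idea from the commit message (a single complex_mode variable that covers both reasons why simple operations may not run in parallel) can be sketched as a small user-space model. This is only an illustration under stated assumptions, not the kernel code: struct sem_model and the model_* functions are hypothetical names, pthread mutexes stand in for the kernel spinlocks, and default sequentially consistent C11 atomics replace the kernel's memory barriers.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space model, not the kernel's struct sem_array. */
struct sem_model {
    pthread_mutex_t global_lock;  /* stands in for sma->sem_perm.lock */
    pthread_mutex_t sem_lock;     /* stands in for one per-semaphore lock */
    int complex_count;            /* sleeping complex ops; protected by global_lock */
    atomic_bool complex_mode;     /* true while parallel simple ops are forbidden */
};

static void model_init(struct sem_model *m)
{
    pthread_mutex_init(&m->global_lock, NULL);
    pthread_mutex_init(&m->sem_lock, NULL);
    m->complex_count = 0;
    atomic_init(&m->complex_mode, false);
}

/* Lock for a simple operation. Returns true if only sem_lock is held,
 * false if the caller ended up holding global_lock instead. */
static bool model_lock_simple(struct sem_model *m)
{
    if (!atomic_load(&m->complex_mode)) {
        pthread_mutex_lock(&m->sem_lock);
        /* Re-check under the lock: a complex op may have started meanwhile. */
        if (!atomic_load(&m->complex_mode))
            return true;
        pthread_mutex_unlock(&m->sem_lock);
    }
    pthread_mutex_lock(&m->global_lock);
    if (m->complex_count == 0) {
        /* False alarm: no complex op is pending, switch to sem_lock. */
        pthread_mutex_lock(&m->sem_lock);
        pthread_mutex_unlock(&m->global_lock);
        return true;
    }
    return false;
}

/* Lock for a complex operation: take global_lock and enter complex mode. */
static void model_lock_complex(struct sem_model *m)
{
    pthread_mutex_lock(&m->global_lock);
    if (atomic_load(&m->complex_mode))
        return;  /* already in complex mode */
    atomic_store(&m->complex_mode, true);
    /* Wait for a simple op that passed the check before the store. */
    pthread_mutex_lock(&m->sem_lock);
    pthread_mutex_unlock(&m->sem_lock);
}

/* Unlock; 'fast' says which lock the caller holds. */
static void model_unlock(struct sem_model *m, bool fast)
{
    if (fast) {
        pthread_mutex_unlock(&m->sem_lock);
        return;
    }
    assert(m->complex_count >= 0);
    /* Leave complex mode only when no complex op is still sleeping. */
    if (m->complex_count == 0)
        atomic_store(&m->complex_mode, false);
    pthread_mutex_unlock(&m->global_lock);
}
```

In this model the fast path reads one variable only, so there is no window between checking "global lock held" and "complex_count != 0" in which a thread could slip through. The flag stays set for as long as a complex operation is sleeping and is cleared only under global_lock.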