Re: [PATCH 1/1] ppc/crash: Skip spinlocks during crash
From: Christophe Leroy
Date: Sat Mar 28 2020 - 06:21:45 EST
Hi Leonardo,
On 03/27/2020 03:51 PM, Leonardo Bras wrote:
Hello Christophe, thanks for the feedback.
I noticed an error in this patch and sent a v2, that can be seen here:
http://patchwork.ozlabs.org/patch/1262468/
Comments inline::
On Fri, 2020-03-27 at 07:50 +0100, Christophe Leroy wrote:
@@ -142,6 +144,8 @@ static inline void arch_spin_lock(arch_spinlock_t *lock)
if (likely(__arch_spin_trylock(lock) == 0))
break;
do {
+ if (unlikely(crash_skip_spinlock))
+ return;
Complete function for reference:
static inline void arch_spin_lock(arch_spinlock_t *lock)
{
while (1) {
if (likely(__arch_spin_trylock(lock) == 0))
break;
do {
if (unlikely(crash_skip_spinlock))
return;
HMT_low();
if (is_shared_processor())
splpar_spin_yield(lock);
} while (unlikely(lock->slock != 0));
HMT_medium();
}
}
You are adding a test that reads a global var in the middle of a so hot
path ? That must kill performance.
I thought it would, in worst case scenario, increase a maximum delay of
an arch_spin_lock() call 1 spin cycle. Here is what I thought:
- If the lock is already free, it would change nothing,
- Otherwise, the lock will wait.
- Waiting cycle just got bigger.
- Worst case scenario: running one more cycle, given lock->slock can
turn to 0 just after checking.
Could you please point where I failed to see the performance penalty?
(I need to get better at this :) )
You are right that when the lock is free, it changes nothing. However
when it is not, it is not just one cycle.
Here is arch_spin_lock() without your patch:
00000440 <my_lock>:
440: 39 40 00 01 li r10,1
444: 7d 20 18 28 lwarx r9,0,r3
448: 2c 09 00 00 cmpwi r9,0
44c: 40 82 00 10 bne 45c <my_lock+0x1c>
450: 7d 40 19 2d stwcx. r10,0,r3
454: 40 a2 ff f0 bne 444 <my_lock+0x4>
458: 4c 00 01 2c isync
45c: 2f 89 00 00 cmpwi cr7,r9,0
460: 4d be 00 20 bclr+ 12,4*cr7+eq
464: 7c 21 0b 78 mr r1,r1
468: 81 23 00 00 lwz r9,0(r3)
46c: 2f 89 00 00 cmpwi cr7,r9,0
470: 40 be ff f4 bne cr7,464 <my_lock+0x24>
474: 7c 42 13 78 mr r2,r2
478: 7d 20 18 28 lwarx r9,0,r3
47c: 2c 09 00 00 cmpwi r9,0
480: 40 82 00 10 bne 490 <my_lock+0x50>
484: 7d 40 19 2d stwcx. r10,0,r3
488: 40 a2 ff f0 bne 478 <my_lock+0x38>
48c: 4c 00 01 2c isync
490: 2f 89 00 00 cmpwi cr7,r9,0
494: 40 be ff d0 bne cr7,464 <my_lock+0x24>
498: 4e 80 00 20 blr
Here is arch_spin_lock() with your patch. I enclose with === what comes
in addition:
00000440 <my_lock>:
440: 39 40 00 01 li r10,1
444: 7d 20 18 28 lwarx r9,0,r3
448: 2c 09 00 00 cmpwi r9,0
44c: 40 82 00 10 bne 45c <my_lock+0x1c>
450: 7d 40 19 2d stwcx. r10,0,r3
454: 40 a2 ff f0 bne 444 <my_lock+0x4>
458: 4c 00 01 2c isync
45c: 2f 89 00 00 cmpwi cr7,r9,0
460: 4d be 00 20 bclr+ 12,4*cr7+eq
=====================================================
464: 3d 40 00 00 lis r10,0
466: R_PPC_ADDR16_HA crash_skip_spinlock
468: 39 4a 00 00 addi r10,r10,0
46a: R_PPC_ADDR16_LO crash_skip_spinlock
46c: 39 00 00 01 li r8,1
470: 89 2a 00 00 lbz r9,0(r10)
474: 2f 89 00 00 cmpwi cr7,r9,0
478: 4c 9e 00 20 bnelr cr7
=====================================================
47c: 7c 21 0b 78 mr r1,r1
480: 81 23 00 00 lwz r9,0(r3)
484: 2f 89 00 00 cmpwi cr7,r9,0
488: 40 be ff f4 bne cr7,47c <my_lock+0x3c>
48c: 7c 42 13 78 mr r2,r2
490: 7d 20 18 28 lwarx r9,0,r3
494: 2c 09 00 00 cmpwi r9,0
498: 40 82 00 10 bne 4a8 <my_lock+0x68>
49c: 7d 00 19 2d stwcx. r8,0,r3
4a0: 40 a2 ff f0 bne 490 <my_lock+0x50>
4a4: 4c 00 01 2c isync
4a8: 2f 89 00 00 cmpwi cr7,r9,0
4ac: 40 be ff c4 bne cr7,470 <my_lock+0x30>
4b0: 4e 80 00 20 blr
Christophe