Re: [PATCH v2] sched/membarrier: Fix redundant load of membarrier_state

From: Michael Ellerman
Date: Tue Oct 29 2024 - 19:29:52 EST


"Nysal Jan K.A." <nysal@xxxxxxxxxxxxx> writes:
> On architectures where ARCH_HAS_SYNC_CORE_BEFORE_USERMODE
> is not selected, sync_core_before_usermode() is a no-op.
> In membarrier_mm_sync_core_before_usermode() the compiler does not
> eliminate the redundant branches and the load of mm->membarrier_state
> in this case, as the atomic_read() cannot be optimized away.
>
> Here's a snippet of the code generated for finish_task_switch() on powerpc
> prior to this change:
>
> 1b786c: ld r26,2624(r30) # mm = rq->prev_mm;
> .......
> 1b78c8: cmpdi cr7,r26,0
> 1b78cc: beq cr7,1b78e4 <finish_task_switch+0xd0>
> 1b78d0: ld r9,2312(r13) # current
> 1b78d4: ld r9,1888(r9) # current->mm
> 1b78d8: cmpd cr7,r26,r9
> 1b78dc: beq cr7,1b7a70 <finish_task_switch+0x25c>
> 1b78e0: hwsync
> 1b78e4: cmplwi cr7,r27,128
> .......
> 1b7a70: lwz r9,176(r26) # atomic_read(&mm->membarrier_state)
> 1b7a74: b 1b78e0 <finish_task_switch+0xcc>
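
For anyone following along, here is a minimal userspace sketch of the
effect described above. Everything in it is a stand-in: HAS_SYNC_CORE
models the Kconfig option, model_atomic_read() models the volatile load
behind atomic_read(), and the early-return guard in sync_guarded() is
only an assumed shape for avoiding the load, not the text of the patch.
The loads counter exists purely to make the difference observable.

```c
#include <stdbool.h>

/* Hypothetical stand-in for ARCH_HAS_SYNC_CORE_BEFORE_USERMODE (not selected). */
#define HAS_SYNC_CORE 0
/* Illustrative flag bit, not taken from the kernel headers. */
#define SYNC_CORE_FLAG (1 << 5)

typedef struct { int counter; } atomic_t;
struct mm_struct { atomic_t membarrier_state; };

/* Counts how many times the state word is actually loaded. */
static int loads;

/* Models atomic_read(): a volatile access the compiler must always emit. */
static int model_atomic_read(const atomic_t *v)
{
	loads++;
	return *(const volatile int *)&v->counter;
}

/* Models sync_core_before_usermode() when it is a no-op. */
static void model_sync_core(void)
{
}

/* Shape of the check as quoted: the load happens even though the
 * only consumer of its result does nothing. */
static void sync_unguarded(struct mm_struct *cur, struct mm_struct *mm)
{
	if (cur != mm)
		return;
	if (!(model_atomic_read(&mm->membarrier_state) & SYNC_CORE_FLAG))
		return;
	model_sync_core();
}

/* Assumed fix shape: a compile-time constant test placed first lets
 * the compiler discard the comparison, the branch and the load. */
static void sync_guarded(struct mm_struct *cur, struct mm_struct *mm)
{
	if (!HAS_SYNC_CORE)
		return;
	if (cur != mm)
		return;
	if (!(model_atomic_read(&mm->membarrier_state) & SYNC_CORE_FLAG))
		return;
	model_sync_core();
}
```

With HAS_SYNC_CORE set to 0, calling sync_unguarded() with cur == mm
still performs one load, while sync_guarded() performs none. That is
the difference the lwz at 1b7a70 in the listing above reflects.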

Reviewed-by: Michael Ellerman <mpe@xxxxxxxxxxxxxx>

cheers