Re: [PATCH 2/3] membarrier: Add an actual barrier before rseq_preempt()

From: Mathieu Desnoyers
Date: Tue Dec 01 2020 - 09:31:53 EST


----- On Dec 1, 2020, at 5:06 AM, Peter Zijlstra peterz@xxxxxxxxxxxxx wrote:

> On Mon, Nov 30, 2020 at 09:50:34AM -0800, Andy Lutomirski wrote:
>> It seems to me that most RSEQ membarrier users will expect any
>> stores done before the membarrier() syscall to be visible to the
>> target task(s). While this is extremely likely to be true in
>> practice, nothing actually guarantees it by a strict reading of the
>> x86 manuals. Rather than providing this guarantee by accident and
>> potentially causing a problem down the road, just add an explicit
>> barrier.
>
> A very long time ago, when Jens introduced smp_call_function(), we had
> this discussion. At the time Linus said that receiving an interrupt had
> better be ordering, and if it is not, then it's up to the architecture
> to handle that before it gets into the common code.
>
> https://lkml.kernel.org/r/alpine.LFD.2.00.0902180744520.21686@localhost.localdomain
>
> Maybe we want to revisit this now, but there might be a fair amount of
> code relying on all this by now.
>
> Documenting it better might help.

Considering that we already have this in the membarrier ipi_mb():

static void ipi_mb(void *info)
{
	smp_mb();	/* IPIs should be serializing but paranoid. */
}

I think it makes sense to add the same smp_mb() to ipi_rseq() if the expected
behavior is to order memory accesses as well, so that it has the same level of
paranoia as ipi_mb().
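
To illustrate the ordering being asked for, here is a rough userspace analogy,
not kernel code. It assumes C11 atomics and pthreads; the names payload,
notified, target_thread() and run_once() are made up for this sketch. A seq_cst
fence stands in for smp_mb(), and a relaxed flag stands in for IPI delivery:

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical names; a userspace analogy only, not the kernel implementation. */
static int payload;		/* plain store done before the "membarrier()" call */
static atomic_int notified;	/* stands in for delivery of the IPI */

/* Analogue of the IPI handler: full fence first, then the target carries on. */
static void *target_thread(void *arg)
{
	int *observed = arg;

	while (!atomic_load_explicit(&notified, memory_order_relaxed))
		;	/* wait for the "IPI" */

	/* Analogue of the smp_mb() proposed at the top of ipi_rseq(). */
	atomic_thread_fence(memory_order_seq_cst);

	*observed = payload;	/* must see the caller's earlier store */
	return NULL;
}

/* Analogue of the membarrier() caller; returns what the target observed. */
static int run_once(void)
{
	pthread_t tid;
	int observed = -1;
	int rc;

	payload = 0;
	atomic_store_explicit(&notified, 0, memory_order_relaxed);

	rc = pthread_create(&tid, NULL, target_thread, &observed);
	assert(rc == 0);

	payload = 42;					/* store before the barrier */
	atomic_thread_fence(memory_order_seq_cst);	/* caller-side full barrier */
	atomic_store_explicit(&notified, 1, memory_order_relaxed);

	rc = pthread_join(tid, NULL);
	assert(rc == 0);
	return observed;
}
```

With both fences in place, the C11 memory model guarantees that run_once()
returns 42. Without the fence on the target side, the relaxed flag load alone
would not order the read of payload, which is the gap the explicit smp_mb() in
the handler is meant to close.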

Thanks,

Mathieu


>
>> Signed-off-by: Andy Lutomirski <luto@xxxxxxxxxx>
>> ---
>> kernel/sched/membarrier.c | 8 ++++++++
>> 1 file changed, 8 insertions(+)
>>
>> diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c
>> index e23e74d52db5..7d98ef5d3bcd 100644
>> --- a/kernel/sched/membarrier.c
>> +++ b/kernel/sched/membarrier.c
>> @@ -40,6 +40,14 @@ static void ipi_mb(void *info)
>>
>>  static void ipi_rseq(void *info)
>>  {
>> +	/*
>> +	 * Ensure that all stores done by the calling thread are visible
>> +	 * to the current task before the current task resumes. We could
>> +	 * probably optimize this away on most architectures, but by the
>> +	 * time we've already sent an IPI, the cost of the extra smp_mb()
>> +	 * is negligible.
>> +	 */
>> +	smp_mb();
>>  	rseq_preempt(current);
>>  }
>
> So I think this really isn't right.

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com