Re: [RFC PATCH] introduce sys_membarrier(): process-wide memory barrier (v5)

From: Mathieu Desnoyers
Date: Tue Jan 19 2010 - 22:13:32 EST


* Peter Zijlstra (peterz@xxxxxxxxxxxxx) wrote:
> On Tue, 2010-01-19 at 19:37 +0100, Peter Zijlstra wrote:
> > On Thu, 2010-01-14 at 14:33 -0500, Mathieu Desnoyers wrote:
> > > It's a case where CPU 1 switches from our mm to another mm:
> > >
> > > CPU 0 (membarrier)          CPU 1 (another mm -> our mm)
> > > <user-space>                <user-space>
> > >                             <buffered access C.S. data>
> > >                             urcu read unlock()
> > >                               barrier()
> > >                               store local gp
> > >                             <kernel-space>
> >
> > OK, so the question is how we end up here. If it's through interrupt
> > preemption, I think the interrupt delivery will imply an mb,
>
> I keep thinking that, but I think we actually refuted that in an earlier
> discussion on this patch.

Intel Architecture Software Developer's Manual Vol. 3: System
Programming
7.4 Serializing Instructions

"MOV to control reg, MOV to debug reg, WRMSR, INVD, INVLPG, WBINVD, LGDT,
LLDT, LIDT, LTR, CPUID, IRET, RSM"

So, this list does _not_ include: INT, SYSENTER, SYSEXIT.

Only IRET is on it. So I don't think it is safe to assume that x86
executes a serializing instruction every time it enters or leaves the
kernel.

>
> > if it's a
> > blocking syscall, the set_task_state() mb [*] should be there.
> >
> > Then we also do:
> >
> > clear_tsk_need_resched()
> >
> > which is an atomic bitop (although it does not imply a full barrier
> > per se).
> >
> > > rq->curr = next (1)
>
> We could possibly look at placing that assignment in context_switch()
> between switch_mm() and switch_to(), which should provide a mb before
> and after I think, Ingo?

That's an interesting idea. It would indeed fix the problem of the
missing barrier before the assignment, but would lack the appropriate
barrier after the assignment. If the rq->curr = next; assignment is made
after load_cr3, then we lack a memory barrier between the assignment and
execution of following user-space code after returning with SYSEXIT (and
we lack the appropriate barrier for other architectures too).
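
To spell out the ordering requirement, here is a minimal user-space
sketch in C11. It is not kernel code: struct task, struct rq and
publish_curr() are hypothetical stand-ins, and the two seq_cst fences
only play the role of the full memory barriers that would be needed on
either side of the rq->curr = next assignment.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's task and runqueue structures. */
struct task {
	int pid;
};

struct rq {
	_Atomic(struct task *) curr;
};

/*
 * Publish "next" as the current task with a full fence on each side of
 * the store. Returns the previously published task.
 */
static struct task *publish_curr(struct rq *rq, struct task *next)
{
	struct task *prev;

	assert(next != NULL);
	prev = atomic_load_explicit(&rq->curr, memory_order_relaxed);

	/* Fence A: order prev's earlier memory accesses before the store. */
	atomic_thread_fence(memory_order_seq_cst);

	atomic_store_explicit(&rq->curr, next, memory_order_relaxed);

	/* Fence B: order the store before next's later memory accesses. */
	atomic_thread_fence(memory_order_seq_cst);

	return prev;
}
```

Moving the assignment between switch_mm() and switch_to() would give
the equivalent of fence A, but nothing on the SYSEXIT return path
provides the equivalent of fence B.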

Thanks,

Mathieu

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68