Re: [PATCH 2/3] mm/page_ref: Ensure page_ref_unfreeze is ordered against prior accesses

From: Will Deacon
Date: Thu Jun 08 2017 - 07:24:30 EST

In reply to: Vlastimil Babka: "Re: [PATCH 2/3] mm/page_ref: Ensure page_ref_unfreeze is ordered against prior accesses"
Next in thread: Peter Zijlstra: "Re: [PATCH 2/3] mm/page_ref: Ensure page_ref_unfreeze is ordered against prior accesses"

[+ PeterZ]

On Thu, Jun 08, 2017 at 01:07:02PM +0200, Vlastimil Babka wrote:
> On 06/08/2017 12:40 PM, Kirill A. Shutemov wrote:
> > On Thu, Jun 08, 2017 at 11:38:21AM +0200, Vlastimil Babka wrote:
> >> On 06/06/2017 07:58 PM, Will Deacon wrote:
> >>> include/linux/page_ref.h | 1 +
> >>> 1 file changed, 1 insertion(+)
> >>>
> >>> diff --git a/include/linux/page_ref.h b/include/linux/page_ref.h
> >>> index 610e13271918..74d32d7905cb 100644
> >>> --- a/include/linux/page_ref.h
> >>> +++ b/include/linux/page_ref.h
> >>> @@ -174,6 +174,7 @@ static inline void page_ref_unfreeze(struct page *page, int count)
> >>>  	VM_BUG_ON_PAGE(page_count(page) != 0, page);
> >>>  	VM_BUG_ON(count == 0);
> >>>
> >>> +	smp_mb__before_atomic();
> >>>  	atomic_set(&page->_refcount, count);
> >
> > I *think* it should be smp_mb(), not __before_atomic(). atomic_set() is
> > not really atomic. For instance on x86 it's plain WRITE_ONCE() which CPU
> > would happily reorder.
>
> Yeah but there are compile barriers, and x86 is TSO, so that's enough?
> Also I found other instances by git grep (not a proof, though :)

I think it boils down to whether:

	smp_mb__before_atomic();
	atomic_set();

should have the same memory ordering semantics as:

	smp_mb();
	atomic_set();

which it doesn't with the x86 implementation AFAICT.
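
To make the comparison concrete, here is a rough user-space sketch in C11
<stdatomic.h>. It is not kernel code. struct fake_page and both function
names are hypothetical, and modelling smp_mb__before_atomic() as a
compiler-only barrier is an assumption based on the x86 behaviour described
in this thread; check it against the tree before relying on it.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical user-space stand-in for struct page; not the kernel type. */
struct fake_page {
	int payload;		/* plain data written while the count is frozen */
	atomic_int refcount;	/* stand-in for page->_refcount */
};

/*
 * Assumed x86-like case: the "barrier" only constrains the compiler and the
 * set is a plain store, so no fence instruction separates the store from
 * later loads.
 */
static inline void unfreeze_compiler_barrier(struct fake_page *page, int count)
{
	assert(count != 0);	/* mirrors VM_BUG_ON(count == 0) */
	atomic_signal_fence(memory_order_seq_cst);	/* compiler-only barrier */
	atomic_store_explicit(&page->refcount, count, memory_order_relaxed);
}

/* smp_mb()-like case: a full fence is issued before the same plain store. */
static inline void unfreeze_full_barrier(struct fake_page *page, int count)
{
	assert(count != 0);
	atomic_thread_fence(memory_order_seq_cst);	/* full memory fence */
	atomic_store_explicit(&page->refcount, count, memory_order_relaxed);
}
```

Both variants leave the same values in memory when run on a single thread.
They differ only in which reorderings the compiler and CPU may perform
around the store.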

The horribly out-of-date atomic_ops.txt isn't so useful:

| If a caller requires memory barrier semantics around an atomic_t
| operation which does not return a value, a set of interfaces are
| defined which accomplish this::
|
|	void smp_mb__before_atomic(void);
|	void smp_mb__after_atomic(void);
|
| For example, smp_mb__before_atomic() can be used like so::
|
|	obj->dead = 1;
|	smp_mb__before_atomic();
|	atomic_dec(&obj->ref_count);
|
| It makes sure that all memory operations preceding the atomic_dec()
| call are strongly ordered with respect to the atomic counter
| operation. In the above example, it guarantees that the assignment of
| "1" to obj->dead will be globally visible to other cpus before the
| atomic counter decrement.
|
| Without the explicit smp_mb__before_atomic() call, the
| implementation could legally allow the atomic counter update visible
| to other cpus before the "obj->dead = 1;" assignment.

which makes it sound more like the barrier is ordering all prior accesses
against the atomic operation itself (without going near cumulativity...),
and not with respect to anything later in program order.
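
One way to picture that reading is the following user-space C11 sketch. It
swaps in a release decrement paired with an acquire load, which is only the
nearest C11 analogue and is weaker than a kernel full barrier. struct
fake_obj and both helpers are hypothetical, and the sketch assumes a single
writer and an initial count of 1.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the "obj" in the atomic_ops.txt example. */
struct fake_obj {
	int dead;
	atomic_int ref_count;
};

/*
 * Writer: everything before the decrement is ordered before the decrement
 * itself. Nothing is implied about accesses that come later in program order.
 */
static inline void mark_dead_and_put(struct fake_obj *obj)
{
	int old;

	obj->dead = 1;
	old = atomic_fetch_sub_explicit(&obj->ref_count, 1, memory_order_release);
	assert(old > 0);	/* catch refcount underflow in this sketch */
}

/*
 * Reader: if the acquire load observes the decremented count (zero here),
 * the earlier "dead = 1" store is guaranteed to be visible as well.
 */
static inline bool saw_dead_after_last_put(struct fake_obj *obj)
{
	if (atomic_load_explicit(&obj->ref_count, memory_order_acquire) != 0)
		return false;
	return obj->dead == 1;
}
```

A reader that sees the count at zero also sees obj->dead set, which is the
guarantee the quoted text describes. The sketch gives no ordering between
the decrement and anything the writer does afterwards.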

Anyway, I think that's sufficient for what we want here, but we should
probably iron out the semantics of this thing.

Will
