Re: [PATCH 0/2] fix vma->anon_vma check for per-VMA locking; fix anon_vma memory ordering

From: Paul E. McKenney
Date: Wed Jul 26 2023 - 19:19:19 EST

Next message: Paul Gofman: "Re: [v3] fs/proc/task_mmu: Implement IOCTL for efficient page table scanning"
Previous message: Randy Dunlap: "Re: Request for linux-kselftest nolibc branch Inclusion in linux-next"
In reply to: Linus Torvalds: "Re: [PATCH 2/2] mm: Fix anon_vma memory ordering"
Next in thread: Jann Horn: "Re: [PATCH 0/2] fix vma->anon_vma check for per-VMA locking; fix anon_vma memory ordering"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Wed, Jul 26, 2023 at 11:41:01PM +0200, Jann Horn wrote:
> Hi!
>
> Patch 1 here is a straightforward fix for a race in per-VMA locking code
> that can lead to use-after-free; I hope we can get this one into
> mainline and stable quickly.
>
> Patch 2 is a fix for what I believe is a longstanding memory ordering
> issue in how vma->anon_vma is used across the MM subsystem; I expect
> that this one will have to go through a few iterations of review and
> potentially rewrites, because memory ordering is tricky.
> (If someone else wants to take over patch 2, I would be very happy.)
>
> These patches don't really belong together all that much, I'm just
> sending them as a series because they'd otherwise conflict.
>
> I am CCing:
>
> - Suren because patch 1 touches his code
> - Matthew Wilcox because he is also currently working on per-VMA
>   locking stuff
> - all the maintainers/reviewers for the Kernel Memory Consistency Model
>   so they can help figure out the READ_ONCE() vs smp_load_acquire()
>   thing

READ_ONCE() has weaker ordering properties than smp_load_acquire().

For example, given a pointer gp:

	p = whichever(gp);
	a = 1;
	r1 = p->b;
	if ((uintptr_t)p & 0x1)
		WRITE_ONCE(b, 1);
	WRITE_ONCE(c, 1);

Leaving aside the "&" needed by smp_load_acquire(), if "whichever" is
"READ_ONCE", then the load from p->b and the WRITE_ONCE() to "b" are
ordered after the load from gp (the former due to an address dependency
and the latter due to a (fragile) control dependency). The compiler
is within its rights to reorder the store to "a" to precede the load
from gp. The compiler is forbidden from reordering the store to "c"
with the load from gp (because both are volatile accesses), but the CPU
is completely within its rights to do this reordering.

But if "whichever" is "smp_load_acquire()", all four of the subsequent
memory accesses are ordered after the load from gp.
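
As a rough userspace illustration of the acquire case, here is a minimal
sketch that assumes C11 <stdatomic.h> as a stand-in for the kernel
primitives, with memory_order_acquire playing the role of
smp_load_acquire(). The names struct item, publish(), and consume() are
hypothetical and do not come from the patches, and C11 atomics only
approximate the Linux-kernel memory model.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical payload; not taken from the kernel patches. */
struct item {
	int b;
};

static _Atomic(struct item *) gp;	/* shared pointer, initially NULL */
static int a;				/* plain variable stored to after the load */

/* Writer: initialise the payload, then publish it with release semantics. */
static void publish(struct item *it, int val)
{
	it->b = val;
	atomic_store_explicit(&gp, it, memory_order_release);
}

/*
 * Reader: the acquire load orders every later access in this function
 * (the plain store to "a" and the load of p->b) after the load from gp.
 * Returns -1 if nothing has been published yet.
 */
static int consume(void)
{
	struct item *p = atomic_load_explicit(&gp, memory_order_acquire);

	if (!p)
		return -1;
	a = 1;
	return p->b;
}
```

With memory_order_relaxed in consume() instead, only the dependent load
of p->b would have any ordering in practice, and C11 does not formally
guarantee even that.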

Similarly, for WRITE_ONCE() and smp_store_release():

	p = READ_ONCE(gp);
	r1 = READ_ONCE(gi);
	r2 = READ_ONCE(gj);
	a = 1;
	WRITE_ONCE(b, 1);
	if (r1 & 0x1)
		whichever(p->q, r2);

Again leaving aside the "&" needed by smp_store_release(), if "whichever"
is WRITE_ONCE(), then the load from gp, the load from gi, and the load
from gj are all ordered before the store to p->q (by address dependency,
control dependency, and data dependency, respectively). The store to "a"
can be reordered with the store to p->q by the compiler. The store to
"b" cannot be reordered with the store to p->q by the compiler (again,
both are volatile), but the CPU is free to reorder them, especially when
whichever() is implemented as a conditional store.

But if "whichever" is "smp_store_release()", all five of the earlier
memory accesses are ordered before the store to p->q.
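
The release case can be sketched the same way, again assuming C11
<stdatomic.h> as a userspace approximation, with memory_order_release
standing in for smp_store_release(). The variables payload_a, payload_b,
and ready and the functions writer() and reader() are made up for
illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical shared state; names are illustrative only. */
static int payload_a;			/* plain data written before publication */
static _Atomic int payload_b;		/* written with a relaxed store */
static _Atomic bool ready;		/* publication flag */

/*
 * Writer: the release store to "ready" orders both earlier stores
 * (plain and relaxed) before it. A relaxed store to "ready" would
 * give no such guarantee.
 */
static void writer(int x, int y)
{
	payload_a = x;
	atomic_store_explicit(&payload_b, y, memory_order_relaxed);
	atomic_store_explicit(&ready, true, memory_order_release);
}

/*
 * Reader: pairs an acquire load with the writer's release store.
 * Returns true and fills *x and *y only once the flag is seen set.
 */
static bool reader(int *x, int *y)
{
	if (!atomic_load_explicit(&ready, memory_order_acquire))
		return false;
	*x = payload_a;
	*y = atomic_load_explicit(&payload_b, memory_order_relaxed);
	return true;
}
```

A reader that sees ready set is then guaranteed to see both payload
values written by writer().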

Does that help, or am I missing the point of your question?

Thanx, Paul

> - people involved in the previous discussion on the security list
>
>
> Jann Horn (2):
> mm: lock_vma_under_rcu() must check vma->anon_vma under vma lock
> mm: Fix anon_vma memory ordering
>
> include/linux/rmap.h | 15 ++++++++++++++-
> mm/huge_memory.c | 4 +++-
> mm/khugepaged.c | 2 +-
> mm/ksm.c | 16 +++++++++++-----
> mm/memory.c | 32 ++++++++++++++++++++------------
> mm/mmap.c | 13 ++++++++++---
> mm/rmap.c | 6 ++++--
> mm/swapfile.c | 3 ++-
> 8 files changed, 65 insertions(+), 26 deletions(-)
>
>
> base-commit: 20ea1e7d13c1b544fe67c4a8dc3943bb1ab33e6f
> --
> 2.41.0.487.g6d72f3e995-goog
>
