Re: [PATCH v6 1/2] rust: mm: add abstractions for mm_struct and vm_area_struct

From: Alice Ryhl
Date: Mon Oct 14 2024 - 05:34:17 EST


On Sun, Oct 13, 2024 at 2:16 AM Boqun Feng <boqun.feng@xxxxxxxxx> wrote:
>
> On Thu, Oct 10, 2024 at 12:56:35PM +0000, Alice Ryhl wrote:
> > These abstractions allow you to manipulate vmas. Rust Binder will uses
> > these in a few different ways.
> >
> > In the mmap implementation, a VmAreaNew will be provided to the mmap
> > call which allows it to modify the vma in ways that are only okay during
> > initial setup. This is the case where the most methods are available.
> >
> > However, Rust Binder needs to insert and remove pages from the vma as
> > time passes. When incoming messages arrive, pages may need to be
> > inserted if space is missing, and in this case that is done by using a
> > stashed ARef<Mm> and calling mmget_not_zero followed by mmap_write_lock
> > followed by vma_lookup followed by vm_insert_page. In this case, since
> > mmap_write_lock is used, the VmAreaMut type will be in use.
> >
> > Another use-case is the shrinker, where the mmap read lock is taken
> > instead, and zap_page_range_single is used to remove pages from the vma.
> > In this case, only the read lock is taken, so the VmAreaRef type will be
> > in use.
> >
> > Future extensions could involve a VmAreaRcuRef for accessing vma methods
> > that are okay to use when holding just the rcu read lock. However, these
> > methods are not needed from Rust yet.
> >
> > This uses shared references even for VmAreaMut. This is preferable to
> > using pinned mutable references because those are pretty inconvenient
> > due to the lack of reborrowing. However, this means that VmAreaMut
> > cannot be Sync. I think it is an acceptable trade-off.
> >
>
> Interesting ;-) I agree it's better than using Pin.
>
> > This patch is based on Wedson's implementation on the old rust branch,
> > but has been changed significantly. All mistakes are Alice's.
> >
> > Co-developed-by: Wedson Almeida Filho <wedsonaf@xxxxxxxxx>
> > Signed-off-by: Wedson Almeida Filho <wedsonaf@xxxxxxxxx>
> > Signed-off-by: Alice Ryhl <aliceryhl@xxxxxxxxxx>
> > ---
> [...]
> > +/// A wrapper for the kernel's `struct mm_struct`.
> > +///
> > +/// This type is like [`Mm`], but with non-zero `mm_users`. It can only be used when `mm_users` can
> > +/// be proven to be non-zero at compile-time, usually because the relevant code holds an `mmget`
> > +/// refcount. It can be used to access the associated address space.
> > +///
> > +/// The `ARef<MmWithUser>` smart pointer holds an `mmget` refcount. Its destructor may sleep.
> > +///
> > +/// # Invariants
> > +///
> > +/// Values of this type are always refcounted using `mmget`. The value of `mm_users` is non-zero.
> > +/// #[repr(transparent)]
> > +pub struct MmWithUser {
> > + mm: Mm,
> > +}
> > +
> > +// SAFETY: It is safe to call `mmput` on another thread than where `mmget` was called.
> > +unsafe impl Send for MmWithUser {}
> > +// SAFETY: All methods on `MmWithUser` can be called in parallel from several threads.
> > +unsafe impl Sync for MmWithUser {}
> > +
> [...]
> > +
> > +/// A guard for the mmap read lock.
> > +///
> > +/// # Invariants
> > +///
> > +/// This `MmapReadLock` guard owns the mmap read lock.
> > +pub struct MmapReadLock<'a> {
> > + mm: &'a MmWithUser,
>
> Since `MmWithUser` is `Sync`, so `MmapReadLock<'a>` is `Send`? However,
> it cannot be a `Send` because the lock must be released by the same
> thread: although ->mmap_lock is a read-write *semaphore*, but
> rw_semaphore by default has strict owner semantics (see
> Documentation/locking/locktypes.rst).
>
> Also given this type is really a lock guard, maybe name it
> something like MmapReadGuard or MmapReadLockGuard?
>
> Same `Send` issue and name suggestion for `MmapWriteLock`.

Fixed in v7.

https://lore.kernel.org/lkml/20241014-vma-v7-0-01e32f861195@xxxxxxxxxx/

> > +impl VmAreaRef {
> > + /// Access a virtual memory area given a raw pointer.
> > + ///
> > + /// # Safety
> > + ///
> > + /// Callers must ensure that `vma` is valid for the duration of 'a, and that the mmap read lock
> > + /// (or stronger) is held for at least the duration of 'a.
> > + #[inline]
> > + pub unsafe fn from_raw<'a>(vma: *const bindings::vm_area_struct) -> &'a Self {
>
> Unrelated to this patch, but since we have so many `from_raw`s, I want
> to suggest that we should look into a #[derive(FromRaw)] ;-) For
> example:
>
> pub trait FromRaw {
> type RawType;
> unsafe fn from_raw<'a>(raw: *const Self::RawType) -> &'a Self;
> }
> and
>
> #[derive(FromRaw)]
> #[repr(transparent)] // repr(transparent) is mandatory.
> struct VmAreaRef {
> vma: Opaque<bindings::vm_area_struct> // Opaque is also mandatory.
> }

Well, perhaps. But it's not a trait, so is a derive the right way?
Either way, it's unrelated to this patch.

Alice