Re: [PATCH v6 1/2] rust: mm: add abstractions for mm_struct and vm_area_struct

From: Boqun Feng
Date: Sat Oct 12 2024 - 20:16:37 EST


On Thu, Oct 10, 2024 at 12:56:35PM +0000, Alice Ryhl wrote:
> These abstractions allow you to manipulate vmas. Rust Binder will uses
> these in a few different ways.
>
> In the mmap implementation, a VmAreaNew will be provided to the mmap
> call which allows it to modify the vma in ways that are only okay during
> initial setup. This is the case where the most methods are available.
>
> However, Rust Binder needs to insert and remove pages from the vma as
> time passes. When incoming messages arrive, pages may need to be
> inserted if space is missing, and in this case that is done by using a
> stashed ARef<Mm> and calling mmget_not_zero followed by mmap_write_lock
> followed by vma_lookup followed by vm_insert_page. In this case, since
> mmap_write_lock is used, the VmAreaMut type will be in use.
>
> Another use-case is the shrinker, where the mmap read lock is taken
> instead, and zap_page_range_single is used to remove pages from the vma.
> In this case, only the read lock is taken, so the VmAreaRef type will be
> in use.
>
> Future extensions could involve a VmAreaRcuRef for accessing vma methods
> that are okay to use when holding just the rcu read lock. However, these
> methods are not needed from Rust yet.
>
> This uses shared references even for VmAreaMut. This is preferable to
> using pinned mutable references because those are pretty inconvenient
> due to the lack of reborrowing. However, this means that VmAreaMut
> cannot be Sync. I think it is an acceptable trade-off.
>

Interesting ;-) I agree it's better than using Pin.

> This patch is based on Wedson's implementation on the old rust branch,
> but has been changed significantly. All mistakes are Alice's.
>
> Co-developed-by: Wedson Almeida Filho <wedsonaf@xxxxxxxxx>
> Signed-off-by: Wedson Almeida Filho <wedsonaf@xxxxxxxxx>
> Signed-off-by: Alice Ryhl <aliceryhl@xxxxxxxxxx>
> ---
[...]
> +/// A wrapper for the kernel's `struct mm_struct`.
> +///
> +/// This type is like [`Mm`], but with non-zero `mm_users`. It can only be used when `mm_users` can
> +/// be proven to be non-zero at compile-time, usually because the relevant code holds an `mmget`
> +/// refcount. It can be used to access the associated address space.
> +///
> +/// The `ARef<MmWithUser>` smart pointer holds an `mmget` refcount. Its destructor may sleep.
> +///
> +/// # Invariants
> +///
> +/// Values of this type are always refcounted using `mmget`. The value of `mm_users` is non-zero.
> +/// #[repr(transparent)]
> +pub struct MmWithUser {
> + mm: Mm,
> +}
> +
> +// SAFETY: It is safe to call `mmput` on another thread than where `mmget` was called.
> +unsafe impl Send for MmWithUser {}
> +// SAFETY: All methods on `MmWithUser` can be called in parallel from several threads.
> +unsafe impl Sync for MmWithUser {}
> +
[...]
> +
> +/// A guard for the mmap read lock.
> +///
> +/// # Invariants
> +///
> +/// This `MmapReadLock` guard owns the mmap read lock.
> +pub struct MmapReadLock<'a> {
> + mm: &'a MmWithUser,

Since `MmWithUser` is `Sync`, so `MmapReadLock<'a>` is `Send`? However,
it cannot be a `Send` because the lock must be released by the same
thread: although ->mmap_lock is a read-write *semaphore*, but
rw_semaphore by default has strict owner semantics (see
Documentation/locking/locktypes.rst).

Also given this type is really a lock guard, maybe name it
something like MmapReadGuard or MmapReadLockGuard?

Same `Send` issue and name suggestion for `MmapWriteLock`.

> +}
> +
> +impl<'a> MmapReadLock<'a> {
> + /// Look up a vma at the given address.
> + #[inline]
> + pub fn vma_lookup(&self, vma_addr: usize) -> Option<&virt::VmAreaRef> {
> + // SAFETY: We hold a reference to the mm, so the pointer must be valid. Any value is okay
> + // for `vma_addr`.
> + let vma = unsafe { bindings::vma_lookup(self.mm.as_raw(), vma_addr as _) };
> +
> + if vma.is_null() {
> + None
> + } else {
> + // SAFETY: We just checked that a vma was found, so the pointer is valid. Furthermore,
> + // the returned area will borrow from this read lock guard, so it can only be used
> + // while the read lock is still held.
> + unsafe { Some(virt::VmAreaRef::from_raw(vma)) }
> + }
> + }
> +}
> +
> +impl Drop for MmapReadLock<'_> {
> + #[inline]
> + fn drop(&mut self) {
> + // SAFETY: We hold the read lock by the type invariants.
> + unsafe { bindings::mmap_read_unlock(self.mm.as_raw()) };
> + }
> +}
> +
> +/// A guard for the mmap write lock.
> +///
> +/// # Invariants
> +///
> +/// This `MmapReadLock` guard owns the mmap write lock.
> +pub struct MmapWriteLock<'a> {
> + mm: &'a MmWithUser,
> +}
> +
> +impl<'a> MmapWriteLock<'a> {
> + /// Look up a vma at the given address.
> + #[inline]
> + pub fn vma_lookup(&mut self, vma_addr: usize) -> Option<&virt::VmAreaMut> {
> + // SAFETY: We hold a reference to the mm, so the pointer must be valid. Any value is okay
> + // for `vma_addr`.
> + let vma = unsafe { bindings::vma_lookup(self.mm.as_raw(), vma_addr as _) };
> +
> + if vma.is_null() {
> + None
> + } else {
> + // SAFETY: We just checked that a vma was found, so the pointer is valid. Furthermore,
> + // the returned area will borrow from this write lock guard, so it can only be used
> + // while the write lock is still held.
> + unsafe { Some(virt::VmAreaMut::from_raw(vma)) }
> + }
> + }
> +}
> +
> +impl Drop for MmapWriteLock<'_> {
> + #[inline]
> + fn drop(&mut self) {
> + // SAFETY: We hold the write lock by the type invariants.
> + unsafe { bindings::mmap_write_unlock(self.mm.as_raw()) };
> + }
> +}
> diff --git a/rust/kernel/mm/virt.rs b/rust/kernel/mm/virt.rs
> new file mode 100644
> index 000000000000..7c09813e22f9
> --- /dev/null
> +++ b/rust/kernel/mm/virt.rs
> @@ -0,0 +1,264 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +// Copyright (C) 2024 Google LLC.
> +
> +//! Virtual memory.
[...]
> +impl VmAreaRef {
> + /// Access a virtual memory area given a raw pointer.
> + ///
> + /// # Safety
> + ///
> + /// Callers must ensure that `vma` is valid for the duration of 'a, and that the mmap read lock
> + /// (or stronger) is held for at least the duration of 'a.
> + #[inline]
> + pub unsafe fn from_raw<'a>(vma: *const bindings::vm_area_struct) -> &'a Self {

Unrelated to this patch, but since we have so many `from_raw`s, I want
to suggest that we should look into a #[derive(FromRaw)] ;-) For
example:

pub trait FromRaw {
type RawType;
unsafe fn from_raw<'a>(raw: *const Self::RawType) -> &'a Self;
}
and

#[derive(FromRaw)]
#[repr(transparent)] // repr(transparent) is mandatory.
struct VmAreaRef {
vma: Opaque<bindings::vm_area_struct> // Opaque is also mandatory.
}

Regards,
Boqun

> + // SAFETY: The caller ensures that the invariants are satisfied for the duration of 'a.
> + unsafe { &*vma.cast() }
> + }
[...]