Re: [PATCH v2] rust: mm: add abstractions for mm_struct and vm_area_struct

From: Alice Ryhl
Date: Tue Jul 30 2024 - 07:05:21 EST


On Mon, Jul 29, 2024 at 6:13 PM Benno Lossin <benno.lossin@xxxxxxxxx> wrote:
>
> On 27.07.24 11:03, Alice Ryhl wrote:
> > +/// A wrapper for the kernel's `struct mm_struct`.
> > +///
> > +/// Since `mm_users` may be zero, the associated address space may not exist anymore. You must use
> > +/// [`mmget_not_zero`] before accessing the address space.
> > +///
> > +/// The `ARef<Mm>` smart pointer holds an `mmgrab` refcount. Its destructor may sleep.
> > +///
> > +/// # Invariants
> > +///
> > +/// Values of this type are always refcounted.
> > +///
> > +/// [`mmget_not_zero`]: Mm::mmget_not_zero
> > +pub struct Mm {
> > + mm: Opaque<bindings::mm_struct>,
> > +}
> > +
> > +/// A wrapper for the kernel's `struct mm_struct`.
> > +///
> > +/// This type is used only when `mm_users` is known to be non-zero at compile-time. It can be used
> > +/// to access the associated address space.
> > +///
> > +/// The `ARef<MmWithUser>` smart pointer holds an `mmget` refcount. Its destructor may sleep.
> > +///
> > +/// # Invariants
> > +///
> > +/// Values of this type are always refcounted. The value of `mm_users` is non-zero.
> > +#[repr(transparent)]
> > +pub struct MmWithUser {
> > + mm: Mm,
> > +}
>
> I personally wouldn't sort it this way (so struct decls, methods and
> then AlwaysRefCounted impl), but I would sort it first by the struct.
> I find it helpful to have the `AlwaysRefCounted` impl close to the
> struct declaration (similarly for `Drop`). But that might just be me.

I can reorder for next version.

> > +
> > +/// Equivalent to `ARef<MmWithUser>` but uses `mmput_async` in destructor.
> > +///
> > +/// The destructor of this type will never sleep.
> > +///
> > +/// # Invariants
> > +///
> > +/// `inner` points to a valid `mm_struct` and the `ARefMmWithUserAsync` owns an `mmget` refcount.
> > +pub struct ARefMmWithUserAsync {
> > + inner: NonNull<bindings::mm_struct>,
>
> I am confused, why doesn't `mm: MM` work here? I.e. also allow usage of
> `ARef<MmWithUserAsync>`.

We could do that, but I don't know how much sense it makes. With Mm
and MmWithUser there's a legitimate distinction between them that
makes sense regardless of whether it's behind an ARef or &. But with
the `mmput_async` case, the distinction only makes sense for ARef
pointers, and &MmWithUser and &MmWithUserAsync would be 100%
interchangeable.

That is to say, this is a property of the pointer, not the pointee. I
don't think it makes sense semantically to have it be a wrapper around
MmWithUser.

> Another approach might be to have the function on `MmWithUser`:
>
> fn put_async(this: ARef<Self>)
>
> Or do you need it to be done on drop?

This needs to happen in drop so that use of the question mark
operation doesn't suddenly result in sleep-in-atomic-ctx bugs.

> > +}
> > +
> > +// Make all `Mm` methods available on `MmWithUser`.
> > +impl Deref for MmWithUser {
> > + type Target = Mm;
> > +
> > + #[inline]
> > + fn deref(&self) -> &Mm {
> > + &self.mm
> > + }
>
> Does it really make sense to expose every function? E.g.
> `mmget_not_zero` would always succeed, right?

I don't think it's a problem. Right now it exposes mmget_not_zero,
is_same_mm, and as_raw. The only one where it doesn't make much sense
is mmget_not_zero, but I don't think it hurts either.

> > +}
> > +
> > +// These methods are safe to call even if `mm_users` is zero.
>
> [...]
>
> > diff --git a/rust/kernel/mm/virt.rs b/rust/kernel/mm/virt.rs
> > new file mode 100644
> > index 000000000000..2e97ef1eac58
> > --- /dev/null
> > +++ b/rust/kernel/mm/virt.rs
> > @@ -0,0 +1,199 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +
> > +// Copyright (C) 2024 Google LLC.
> > +
> > +//! Virtual memory.
> > +
> > +use crate::{
> > + bindings,
> > + error::{to_result, Result},
> > + page::Page,
> > + types::Opaque,
> > +};
> > +
> > +/// A wrapper for the kernel's `struct vm_area_struct`.
> > +///
> > +/// It represents an area of virtual memory.
> > +#[repr(transparent)]
> > +pub struct VmArea {
> > + vma: Opaque<bindings::vm_area_struct>,
> > +}
> > +
> > +impl VmArea {
> > + /// Access a virtual memory area given a raw pointer.
> > + ///
> > + /// # Safety
> > + ///
> > + /// Callers must ensure that `vma` is valid for the duration of 'a, with shared access. The
> > + /// caller must ensure that using the pointer for immutable operations is okay.
>
> Nothing here states that the pointee is not allowed to be changed,
> unless you mean that by "shared access" which would not match my
> definition.

How about this?

Callers must ensure that:
* `vma` is valid for the duration of 'a.
* the caller holds the mmap read lock for 'a.

And `from_raw_vma_mut` would instead require the caller to hold the
mmap write lock.

> > + #[inline]
> > + pub unsafe fn from_raw_vma<'a>(vma: *const bindings::vm_area_struct) -> &'a Self {
> > + // SAFETY: The caller ensures that the pointer is valid.
> > + unsafe { &*vma.cast() }
> > + }
> > +
> > + /// Access a virtual memory area given a raw pointer.
> > + ///
> > + /// # Safety
> > + ///
> > + /// Callers must ensure that `vma` is valid for the duration of 'a, with exclusive access. The
> > + /// caller must ensure that using the pointer for immutable and mutable operations is okay.
> > + #[inline]
> > + pub unsafe fn from_raw_vma_mut<'a>(vma: *mut bindings::vm_area_struct) -> &'a mut Self {
> > + // SAFETY: The caller ensures that the pointer is valid.
> > + unsafe { &mut *vma.cast() }
> > + }
> > +
> > + /// Returns a raw pointer to this area.
> > + #[inline]
> > + pub fn as_ptr(&self) -> *mut bindings::vm_area_struct {
> > + self.vma.get()
> > + }
> > +
> > + /// Returns the flags associated with the virtual memory area.
> > + ///
> > + /// The possible flags are a combination of the constants in [`flags`].
> > + #[inline]
> > + pub fn flags(&self) -> usize {
> > + // SAFETY: The pointer is valid since self is a reference. The field is valid for reading
> > + // given a shared reference.
>
> Why is the field not changed from the C side? Is this part readonly?

Because we hold the mmap read lock. (or the write lock)

Alice