Re: [PATCH v11 2/8] mm: rust: add vm_area_struct methods that require read access

From: Andreas Hindborg
Date: Mon Dec 16 2024 - 09:53:43 EST



Hi Alice,

In general, can we avoid the `as _` casts? If not, could you elaborate
on why they are the right choice here rather than `try_into`?
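
For example, for the `start()` accessor below, I would have expected one
of these shapes (untested sketch):

    // Conversion target spelled out rather than inferred:
    unsafe { (*self.as_ptr()).__bindgen_anon_1.__bindgen_anon_1.vm_start as usize }

    // Or, if truncation is conceivable, fail loudly:
    usize::try_from(unsafe { (*self.as_ptr()).__bindgen_anon_1.__bindgen_anon_1.vm_start })
        .expect("vm_start fits in usize")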

Other comments inline below.

"Alice Ryhl" <aliceryhl@xxxxxxxxxx> writes:

> This adds a type called VmAreaRef which is used when referencing a vma
> that you have read access to. Here, read access means that you hold
> either the mmap read lock or the vma read lock (or stronger).
>
> Additionally, a vma_lookup method is added to the mmap read guard, which
> enables you to obtain a &VmAreaRef in safe Rust code.
>
> This patch only provides a way to lock the mmap read lock, but a
> follow-up patch also provides a way to just lock the vma read lock.
>
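Just to check my understanding: the end-to-end safe usage this enables
would be roughly the following? (Sketch only; I am guessing the exact
guard type and lock method names from the series.)

    let mm: &MmWithUser = ...;
    let guard = mm.mmap_read_lock();
    if let Some(vma) = guard.vma_lookup(addr) {
        pr_info!("vma {:#x}..{:#x}\n", vma.start(), vma.end());
    }
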
> Acked-by: Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx> (for mm bits)
> Reviewed-by: Jann Horn <jannh@xxxxxxxxxx>
> Signed-off-by: Alice Ryhl <aliceryhl@xxxxxxxxxx>
> ---
> rust/helpers/mm.c | 6 ++
> rust/kernel/mm.rs | 21 ++++++
> rust/kernel/mm/virt.rs | 191 +++++++++++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 218 insertions(+)
>

[cut]

> diff --git a/rust/kernel/mm/virt.rs b/rust/kernel/mm/virt.rs
> new file mode 100644
> index 000000000000..68c763169cf0
> --- /dev/null
> +++ b/rust/kernel/mm/virt.rs
> @@ -0,0 +1,191 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +// Copyright (C) 2024 Google LLC.
> +
> +//! Virtual memory.

Could you add a bit more context here?
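
Perhaps something along these lines (wording is just a suggestion):

    //! Virtual memory.
    //!
    //! This module covers inspecting a single virtual memory area
    //! (`struct vm_area_struct`) in the address space of a userspace
    //! process, under the appropriate mmap or vma lock.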

> +
> +use crate::{bindings, mm::MmWithUser, types::Opaque};
> +
> +/// A wrapper for the kernel's `struct vm_area_struct` with read access.
> +///
> +/// It represents an area of virtual memory.
> +///
> +/// # Invariants
> +///
> +/// The caller must hold the mmap read lock or the vma read lock.
> +#[repr(transparent)]
> +pub struct VmAreaRef {
> + vma: Opaque<bindings::vm_area_struct>,
> +}
> +
> +// Methods you can call when holding the mmap or vma read lock (or strong). They must be usable no
> +// matter what the vma flags are.

Typo: "strong" should be "stronger".

> +impl VmAreaRef {
> + /// Access a virtual memory area given a raw pointer.
> + ///
> + /// # Safety
> + ///
> + /// Callers must ensure that `vma` is valid for the duration of 'a, and that the mmap or vma
> + /// read lock (or stronger) is held for at least the duration of 'a.
> + #[inline]
> + pub unsafe fn from_raw<'a>(vma: *const bindings::vm_area_struct) -> &'a Self {
> + // SAFETY: The caller ensures that the invariants are satisfied for the duration of 'a.
> + unsafe { &*vma.cast() }
> + }
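
As an aside, I assume the expected pattern at an FFI boundary is
something like this hypothetical callback (`my_hook` is made up, not
from this series):

    // The C caller is assumed to hold the mmap read lock across this
    // entire callback, which is what justifies the `from_raw` call.
    unsafe extern "C" fn my_hook(vma: *const bindings::vm_area_struct) {
        // SAFETY: the caller holds the mmap read lock for the duration
        // of this call, so `vma` is valid and read-locked.
        let vma = unsafe { VmAreaRef::from_raw(vma) };
        let _size = vma.end() - vma.start();
    }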
> +
> + /// Returns a raw pointer to this area.
> + #[inline]
> + pub fn as_ptr(&self) -> *mut bindings::vm_area_struct {
> + self.vma.get()
> + }
> +
> + /// Access the underlying `mm_struct`.
> + #[inline]
> + pub fn mm(&self) -> &MmWithUser {
> + // SAFETY: By the type invariants, this `vm_area_struct` is valid and we hold the mmap/vma
> + // read lock or stronger. This implies that the underlying mm has a non-zero value of
> + // `mm_users`.
> + unsafe { MmWithUser::from_raw((*self.as_ptr()).vm_mm) }
> + }
> +
> + /// Returns the flags associated with the virtual memory area.
> + ///
> + /// The possible flags are a combination of the constants in [`flags`].
> + #[inline]
> + pub fn flags(&self) -> vm_flags_t {
> + // SAFETY: By the type invariants, the caller holds at least the mmap read lock, so this
> + // access is not a data race.
> + unsafe { (*self.as_ptr()).__bindgen_anon_2.vm_flags as _ }
> + }
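
(Aside: I take it the intended way to test these is plain bitwise ops,
e.g. something like:

    // `flags::VM_WRITE` refers to the constants module mentioned in
    // the doc comment above.
    if vma.flags() & flags::VM_WRITE != 0 {
        // The mapping is writable.
    }

which is another place where an inferred-width cast made me pause.)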
> +
> + /// Returns the (inclusive) start address of the virtual memory area.
> + #[inline]
> + pub fn start(&self) -> usize {
> + // SAFETY: By the type invariants, the caller holds at least the mmap read lock, so this
> + // access is not a data race.
> + unsafe { (*self.as_ptr()).__bindgen_anon_1.__bindgen_anon_1.vm_start as _ }
> + }
> +
> + /// Returns the (exclusive) end address of the virtual memory area.
> + #[inline]
> + pub fn end(&self) -> usize {
> + // SAFETY: By the type invariants, the caller holds at least the mmap read lock, so this
> + // access is not a data race.
> + unsafe { (*self.as_ptr()).__bindgen_anon_1.__bindgen_anon_1.vm_end as _ }
> + }
> +
> + /// Zap pages in the given page range.
> + ///
> + /// This clears page table mappings for the range at the leaf level, leaving all other page
> + /// tables intact, and freeing any memory referenced by the VMA in this range. That is,
> + /// anonymous memory is completely freed, file-backed memory has its reference count on page
> + /// cache folios dropped, any dirty data will still be written back to disk as usual.
> + #[inline]
> + pub fn zap_page_range_single(&self, address: usize, size: usize) {

I don't fully understand this docstring. Is it correct that the function
will unmap the address range given by `address` and `size`, _and_ free the
pages used to hold the mappings at the leaf level of the page table?
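
For my own understanding: the intended use is then something like the
following, e.g. to drop every page in the VMA?

    // Page tables above the leaf level stay allocated; the pages
    // mapped in the range are what get released.
    vma.zap_page_range_single(vma.start(), vma.end() - vma.start());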


Best regards,
Andreas Hindborg