Re: [PATCH v3 4/4] rust: add abstraction for `struct page`

From: Alice Ryhl
Date: Thu Mar 21 2024 - 09:43:23 EST


On Thu, Mar 21, 2024 at 2:16 PM Benno Lossin <benno.lossin@xxxxxxxxx> wrote:
>
> On 3/20/24 09:46, Alice Ryhl wrote:
> >> On 3/11/24 11:47, Alice Ryhl wrote:
> >>> +/// A pointer to a page that owns the page allocation.
> >>> +///
> >>> +/// # Invariants
> >>> +///
> >>> +/// The pointer points at a page, and has ownership over the page.
> >>
> >> Why not "`page` is valid"?
> >> Do you mean by ownership of the page that `page` has ownership of the
> >> allocation, or does that entail any other property/privilege?
> >
> > I can add "at a valid page".
>
> I don't think that helps, what you need as an invariant is that the
> pointer is valid.

To me "points at a page" implies that the pointer is valid. I mean, if
it was dangling, it would not point at a page?

But I can reword to something else if you have a preferred phrasing.

> >>> + /// Runs a piece of code with this page mapped to an address.
> >>> + ///
> >>> + /// The page is unmapped when this call returns.
> >>> + ///
> >>> + /// It is up to the caller to use the provided raw pointer correctly.
> >>
> >> This says nothing about what 'correctly' means. What I gathered from the
> >> implementation is that the supplied pointer is valid for the execution
> >> of `f` for `PAGE_SIZE` bytes.
> >> What other things are you allowed to rely upon?
> >>
> >> Is it really OK for this function to be called from multiple threads?
> >> Could that not result in the same page being mapped multiple times? If
> >> that is fine, what about potential data races when two threads write to
> >> the pointer given to `f`?
> >>
> >>> + pub fn with_page_mapped<T>(&self, f: impl FnOnce(*mut u8) -> T) -> T {
> >
> > I will say:
> >
> > /// It is up to the caller to use the provided raw pointer correctly.
> > /// The pointer is valid for `PAGE_SIZE` bytes and for the duration in
> > /// which the closure is called. Depending on the gfp flags and kernel
> > /// configuration, the pointer may only be mapped on the current thread,
> > /// and in those cases, dereferencing it on other threads is UB. Other
> > /// than that, the usual rules for dereferencing a raw pointer apply.
> > /// (E.g., don't cause data races, the memory may be uninitialized, and
> > /// so on.)
>
> I would simplify and drop "depending on the gfp flags and kernel..." and
> just say that the pointer is only valid on the current thread.

Sure, that works for me.

> Also would it make sense to make the pointer type *mut [u8; PAGE_SIZE]?

I think it's a trade-off. That makes the code more error-prone, since
`pointer::add` now doesn't move by a number of bytes, but a number of
pages.

> > It's okay to map it multiple times from different threads.
>
> Do you still need to take care of data races?
> So would it be fine to execute this code on two threads in parallel?
>
> static PAGE: Page = ...; // assume we have a page accessible by both threads
>
> PAGE.with_page_mapped(|ptr| {
> loop {
> unsafe { ptr.write(0) };
> pr_info!("{}", unsafe { ptr.read() });
> }
> });

Like I said, the usual pointer rules apply. Two threads can access it
in parallel as long as one of the following are satisfied:

* Both accesses are reads.
* Both accesses are atomic.
* They access disjoint byte ranges.

Other than the fact that it uses a thread-local mapping on machines
that can't address all of their memory at the same time, it's
completely normal memory. It's literally just a PAGE_SIZE-aligned
allocation of PAGE_SIZE bytes.

> If this is not allowed, I don't really like the API. As a raw version it
> would be fine, but I think we should have a safer version (eg by taking
> `&mut self`).

I don't understand what you mean. It is the *most* raw API that `Page`
has. I can make them private if you want me to. The API cannot take
`&mut self` because I need to be able to unsafely perform concurrent
writes to disjoint byte ranges.

Alice