Re: [PATCH v2 5/9] rust: list: add List

From: Benno Lossin
Date: Mon May 27 2024 - 06:25:56 EST


On 06.05.24 11:53, Alice Ryhl wrote:
> Add the actual linked list itself.
>
> The linked list uses the following design: The List type itself just has
> a single pointer to the first element of the list. And the actual list
> items then form a cycle. So the last item is `first->prev`.
>
> This is slightly different from the usual kernel linked list. Matching
> that exactly would amount to giving List two pointers, and having it be
> part of the cycle of items. This alternate design has the advantage that
> the cycle is never completely empty, which can reduce the number of
> branches in some cases. However, it also has the disadvantage that List
> must be pinned, which this design is trying to avoid.
>
> Having the list items form a cycle rather than having null pointers at
> the beginning/end is convenient for several reasons. For one, it lets us
> store only one pointer in List, and it simplifies the implementation of
> several functions.
>
> Unfortunately, the `remove` function that removes an arbitrary element
> from the list has to be unsafe. This is needed because there is no way
> to handle the case where you pass an element from the wrong list. For
> example, if it is the first element of some other list, then that other
> list's `first` pointer would not be updated. Similarly, it could be a
> data race if you try to remove it from two different lists in parallel.
> (There's no problem with passing `remove` an item that's not in any
> list. Additionally, other removal methods such as `pop_front` need not
> be unsafe, as they can't be used to remove items from another list.)

I would also mention that later in this patch series you introduce
cursors for the list, which can be used to safely remove arbitrary items
(although that requires iterating over the list).
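
To make the wrong-list case from the commit message concrete, here is a
small user-space model. It is only a sketch: it uses indices into a shared
arena instead of raw pointers, and `Arena`, `ToyList`, `unlink` and
`wrong_list_removal_demo` are hypothetical names that do not exist in the
patch. It shows that unlinking the first element of list `b` through list
`a` leaves `b.first` pointing at an item that is no longer linked.

```rust
// Illustrative model only, NOT the kernel's `List`. All names are hypothetical.

#[derive(Clone, Copy)]
struct Links {
    next: usize,
    prev: usize,
}

/// Storage shared by several toy lists; `None` means "not in any list".
struct Arena {
    links: Vec<Option<Links>>,
}

/// Like the `List` in the patch, this stores only the first element.
struct ToyList {
    first: Option<usize>,
}

impl ToyList {
    fn push_back(&mut self, arena: &mut Arena, item: usize) {
        match self.first {
            None => {
                // A list with one item is a cycle of length one.
                arena.links[item] = Some(Links { next: item, prev: item });
                self.first = Some(item);
            }
            Some(first) => {
                let last = arena.links[first].unwrap().prev;
                arena.links[item] = Some(Links { next: first, prev: last });
                arena.links[last].as_mut().unwrap().next = item;
                arena.links[first].as_mut().unwrap().prev = item;
            }
        }
    }

    /// Unlinks `item`, assuming (without checking) that it is in `self`.
    fn unlink(&mut self, arena: &mut Arena, item: usize) {
        let Links { next, prev } = arena.links[item].take().expect("item is in some list");
        if next != item {
            arena.links[prev].as_mut().unwrap().next = next;
            arena.links[next].as_mut().unwrap().prev = prev;
        }
        // Only *this* list's `first` can be fixed up here.
        if self.first == Some(item) {
            self.first = if next == item { None } else { Some(next) };
        }
    }
}

/// Returns `b.first` and whether item 1 is unlinked after the bad removal.
fn wrong_list_removal_demo() -> (Option<usize>, bool) {
    let mut arena = Arena { links: vec![None; 3] };
    let mut a = ToyList { first: None };
    let mut b = ToyList { first: None };
    a.push_back(&mut arena, 0);
    b.push_back(&mut arena, 1);
    b.push_back(&mut arena, 2);
    // Wrong list: item 1 is the first element of `b`, not of `a`.
    a.unlink(&mut arena, 1);
    (b.first, arena.links[1].is_none())
}
```

In this model the stale `b.first` is merely a logic error. With raw
pointers, as in the patch, the equivalent situation breaks the list
invariants, which is why `remove` has to be unsafe.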

> Signed-off-by: Alice Ryhl <aliceryhl@xxxxxxxxxx>
> ---
> rust/kernel/list.rs | 329 +++++++++++++++++++++++++++++++++++++++++++++++-
> rust/kernel/list/arc.rs | 6 +-
> 2 files changed, 330 insertions(+), 5 deletions(-)
>
> diff --git a/rust/kernel/list.rs b/rust/kernel/list.rs
> index f2eca542e090..d0ff29a3e5d1 100644
> --- a/rust/kernel/list.rs
> +++ b/rust/kernel/list.rs
> @@ -6,6 +6,7 @@
>
> use crate::init::PinInit;
> use crate::types::Opaque;
> +use core::marker::PhantomData;
> use core::ptr;
>
> mod impl_list_item_mod;
> @@ -16,7 +17,40 @@
> impl_list_arc_safe, AtomicListArcTracker, ListArc, ListArcSafe, TryNewListArc,
> };
>
> -/// Implemented by types where a [`ListArc<Self>`] can be inserted into a `List`.
> +/// A linked list.
> +///
> +/// All elements in this linked list will be [`ListArc`] references to the value. Since a value can
> +/// only have one `ListArc` (for each pair of prev/next pointers), this ensures that the same
> +/// prev/next pointers are not used for several linked lists.
> +///
> +/// # Invariants
> +///
> +/// * If the list is empty, then `first` is null. Otherwise, `first` points at the links field of
> +///   the first element in the list.
> +/// * All prev/next pointers of items in the list are valid and form a cycle.

I think that you additionally need "The list has exclusive access to all
`prev`/`next` pointers of items in the list." or "For every item in the
list, the list owns the associated `ListArc<T, ID>`."
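
As a rough sketch of how the invariants section could read with that
addition (the exact wording is of course up to you). The stand-in
`ListLinksFields` and `ListArc` definitions and the `new`/`is_empty`
helpers are assumptions that only exist so the snippet compiles outside
the kernel tree, and the `ListItem<ID>` bound is dropped for the same
reason:

```rust
use core::marker::PhantomData;

// Stand-ins so that the sketch is self-contained; the real types are
// defined elsewhere in this series.
pub struct ListLinksFields {
    pub next: *mut ListLinksFields,
    pub prev: *mut ListLinksFields,
}
pub struct ListArc<T: ?Sized, const ID: u64 = 0>(PhantomData<T>);

/// A linked list.
///
/// # Invariants
///
/// * If the list is empty, then `first` is null. Otherwise, `first` points at the links field of
///   the first element in the list.
/// * All prev/next pointers of items in the list are valid and form a cycle.
/// * For every item in the list, the list owns the associated `ListArc<T, ID>` and has
///   exclusive access to the item's `prev`/`next` pointers.
pub struct List<T: ?Sized, const ID: u64 = 0> {
    first: *mut ListLinksFields,
    _ty: PhantomData<ListArc<T, ID>>,
}

impl<T: ?Sized, const ID: u64> List<T, ID> {
    /// Creates an empty list. INVARIANT: `first` is null for an empty list.
    pub const fn new() -> Self {
        Self {
            first: core::ptr::null_mut(),
            _ty: PhantomData,
        }
    }

    /// Returns whether the list is empty, relying on the first invariant.
    pub fn is_empty(&self) -> bool {
        self.first.is_null()
    }
}
```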

> +pub struct List<T: ?Sized + ListItem<ID>, const ID: u64 = 0> {
> +    first: *mut ListLinksFields,
> +    _ty: PhantomData<ListArc<T, ID>>,
> +}

[...]

> +    /// Add the provided item to the back of the list.
> +    pub fn push_back(&mut self, item: ListArc<T, ID>) {
> +        let raw_item = ListArc::into_raw(item);
> +        // SAFETY:
> +        // * We just got `raw_item` from a `ListArc`, so it's in an `Arc`.
> +        // * If this requirement is violated, then the previous caller of `prepare_to_insert`
> +        //   violated the safety requirement that they can't give up ownership of the `ListArc`
> +        //   until they call `post_remove`.
> +        // * We own the `ListArc`.
> +        // * Removing items from this list is always done using `remove_internal_inner`, which
> +        //   calls `post_remove` before giving up ownership.
> +        let list_links = unsafe { T::prepare_to_insert(raw_item) };
> +        // SAFETY: We have not yet called `post_remove`, so `list_links` is still valid.
> +        let item = unsafe { ListLinks::fields(list_links) };
> +
> +        if self.first.is_null() {
> +            self.first = item;
> +            // SAFETY: The caller just gave us ownership of these fields.
> +            // INVARIANT: A linked list with one item should be cyclic.
> +            unsafe {
> +                (*item).next = item;
> +                (*item).prev = item;
> +            }
> +        } else {
> +            let next = self.first;
> +            // SAFETY: By the type invariant, this pointer is valid or null. We just checked that
> +            // it's not null, so it must be valid.
> +            let prev = unsafe { (*next).prev };
> +            // SAFETY: Pointers in a linked list are never dangling, and the caller just gave us
> +            // ownership of the fields on `item`.

Here you need that new invariant: the list needs exclusive access to all
of the `next`/`prev` pointers.

> +            // INVARIANT: This correctly inserts `item` between `prev` and `next`.
> +            unsafe {
> +                (*item).next = next;
> +                (*item).prev = prev;
> +                (*prev).next = item;
> +                (*next).prev = item;
> +            }
> +        }
> +    }
> +
> +    /// Add the provided item to the front of the list.
> +    pub fn push_front(&mut self, item: ListArc<T, ID>) {
> +        let raw_item = ListArc::into_raw(item);
> +        // SAFETY:
> +        // * We just got `raw_item` from a `ListArc`, so it's in an `Arc`.
> +        // * If this requirement is violated, then the previous caller of `prepare_to_insert`
> +        //   violated the safety requirement that they can't give up ownership of the `ListArc`
> +        //   until they call `post_remove`.
> +        // * We own the `ListArc`.
> +        // * Removing items from this list is always done using `remove_internal_inner`, which
> +        //   calls `post_remove` before giving up ownership.
> +        let list_links = unsafe { T::prepare_to_insert(raw_item) };
> +        // SAFETY: We have not yet called `post_remove`, so `list_links` is still valid.
> +        let item = unsafe { ListLinks::fields(list_links) };
> +
> +        if self.first.is_null() {
> +            // SAFETY: The caller just gave us ownership of these fields.
> +            // INVARIANT: A linked list with one item should be cyclic.
> +            unsafe {
> +                (*item).next = item;
> +                (*item).prev = item;
> +            }
> +        } else {
> +            let next = self.first;
> +            // SAFETY: We just checked that `next` is non-null.
> +            let prev = unsafe { (*next).prev };
> +            // SAFETY: Pointers in a linked list are never dangling, and the caller just gave us
> +            // ownership of the fields on `item`.
> +            // INVARIANT: This correctly inserts `item` between `prev` and `next`.
> +            unsafe {
> +                (*item).next = next;
> +                (*item).prev = prev;
> +                (*prev).next = item;
> +                (*next).prev = item;
> +            }
> +        }

This code is the same as in `push_back`; can you refactor it into a
shared helper?
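
One possible shape for such a refactor, as a user-space sketch rather than
a drop-in replacement: nodes are owned through `Box` instead of `ListArc`,
the payload is a plain `u32`, and `Node`, `ToyList`, `insert_inner` and
`values` are hypothetical names. The point is only that, in a cycle,
inserting before `first` covers both cases, and `push_front` additionally
moves `first`.

```rust
use std::ptr;

// Hypothetical stand-in for the kernel types; not the code from the patch.
struct Node {
    next: *mut Node,
    prev: *mut Node,
    value: u32,
}

struct ToyList {
    first: *mut Node,
}

impl ToyList {
    fn new() -> Self {
        ToyList { first: ptr::null_mut() }
    }

    /// Links `item` in just before `first`, i.e. at the back of the cycle.
    /// Only touches `first` if the list was empty. Returns the raw pointer.
    fn insert_inner(&mut self, item: Box<Node>) -> *mut Node {
        let item = Box::into_raw(item);
        if self.first.is_null() {
            self.first = item;
            // SAFETY: `item` comes from `Box::into_raw`, so it is valid and owned by us.
            unsafe {
                (*item).next = item;
                (*item).prev = item;
            }
        } else {
            let next = self.first;
            // SAFETY: `first` is non-null, so it points at a live node owned by this list.
            let prev = unsafe { (*next).prev };
            // SAFETY: All nodes in the cycle are live and exclusively owned by this list.
            unsafe {
                (*item).next = next;
                (*item).prev = prev;
                (*prev).next = item;
                (*next).prev = item;
            }
        }
        item
    }

    fn push_back(&mut self, item: Box<Node>) {
        self.insert_inner(item);
    }

    fn push_front(&mut self, item: Box<Node>) {
        // Same insertion as `push_back`; only `first` moves.
        self.first = self.insert_inner(item);
    }

    /// Collects the values in list order, starting at `first`.
    fn values(&self) -> Vec<u32> {
        let mut out = Vec::new();
        if self.first.is_null() {
            return out;
        }
        let mut cur = self.first;
        loop {
            // SAFETY: Every pointer in the cycle refers to a live node.
            unsafe {
                out.push((*cur).value);
                cur = (*cur).next;
            }
            if cur == self.first {
                break;
            }
        }
        out
    }
}

impl Drop for ToyList {
    fn drop(&mut self) {
        if self.first.is_null() {
            return;
        }
        let first = self.first;
        let mut cur = first;
        loop {
            // SAFETY: Each node was created by `Box::into_raw` and is freed exactly once.
            let next = unsafe { (*cur).next };
            unsafe { drop(Box::from_raw(cur)) };
            if next == first {
                break;
            }
            cur = next;
        }
    }
}

/// Convenience constructor for an unlinked node.
fn node(value: u32) -> Box<Node> {
    Box::new(Node { next: ptr::null_mut(), prev: ptr::null_mut(), value })
}
```

In the real code the helper would presumably also contain the
`prepare_to_insert` call and its safety comment, so that they are written
only once.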

> +        self.first = item;
> +    }
> +
> +    /// Removes the last item from this list.
> +    pub fn pop_back(&mut self) -> Option<ListArc<T, ID>> {
> +        if self.first.is_null() {
> +            return None;
> +        }
> +
> +        // SAFETY: We just checked that the list is not empty.
> +        let last = unsafe { (*self.first).prev };
> +        // SAFETY: The last item of this list is in this list.
> +        Some(unsafe { self.remove_internal(last) })
> +    }
> +
> +    /// Removes the first item from this list.
> +    pub fn pop_front(&mut self) -> Option<ListArc<T, ID>> {
> +        if self.first.is_null() {
> +            return None;
> +        }
> +
> +        // SAFETY: The first item of this list is in this list.
> +        Some(unsafe { self.remove_internal(self.first) })
> +    }
> +
> +    /// Removes the provided item from this list and returns it.
> +    ///
> +    /// This returns `None` if the item is not in the list. (Note that by the safety requirements,
> +    /// this means that the item is not in any list.)
> +    ///
> +    /// # Safety
> +    ///
> +    /// The provided item must not be in a different linked list (with the same id).

Please write "`item` must not be ..." here; the same applies to the other
instances below.

---
Cheers,
Benno

> +    pub unsafe fn remove(&mut self, item: &T) -> Option<ListArc<T, ID>> {
> +        let mut item = unsafe { ListLinks::fields(T::view_links(item)) };
> +        // SAFETY: The user provided a reference, and references are never dangling.
> +        //
> +        // As for why this is not a data race, there are two cases:
> +        //
> +        // * If `item` is not in any list, then these fields are read-only and null.
> +        // * If `item` is in this list, then we have exclusive access to these fields since we
> +        //   have a mutable reference to the list.
> +        //
> +        // In either case, there's no race.
> +        let ListLinksFields { next, prev } = unsafe { *item };
> +
> +        debug_assert_eq!(next.is_null(), prev.is_null());
> +        if !next.is_null() {
> +            // This is really a no-op, but this ensures that `item` is a raw pointer that was
> +            // obtained without going through a pointer->reference->pointer conversion roundtrip.
> +            // This ensures that the list is valid under the more restrictive strict provenance
> +            // ruleset.
> +            //
> +            // SAFETY: We just checked that `next` is not null, and it's not dangling by the
> +            // list invariants.
> +            unsafe {
> +                debug_assert_eq!(item, (*next).prev);
> +                item = (*next).prev;
> +            }
> +
> +            // SAFETY: We just checked that `item` is in a list, so the caller guarantees that it
> +            // is in this list. The pointers are in the right order.
> +            Some(unsafe { self.remove_internal_inner(item, next, prev) })
> +        } else {
> +            None
> +        }
> +    }

[...]