Re: [PATCH 5/9] rust: list: add List

From: Benno Lossin
Date: Thu Apr 04 2024 - 10:04:16 EST


On 02.04.24 14:17, Alice Ryhl wrote:
> Add the actual linked list itself.
>
> The linked list uses the following design: The List type itself just has
> a single pointer to the first element of the list. And the actual list
> items then form a cycle. So the last item is `first->prev`.
>
> This is slightly different from the usual kernel linked list. Matching
> that exactly would amount to giving List two pointers, and having it be
> part of the cycle of items. This alternate design has the advantage that
> the cycle is never completely empty, which can reduce the number of
> branches in some cases. However, it also has the disadvantage that List
> must be pinned, which this design is trying to avoid.
>
> Having the list items form a cycle rather than having null pointers at
> the beginning/end is convenient for several reasons. For one, it lets us
> store only one pointer in List, and it simplifies the implementation of
> several functions.
>
> Unfortunately, the `remove` function that removes an arbitrary element
> from the list has to be unsafe. This is needed because there is no way
> to handle the case where you pass an element from the wrong list. For
> example, if it is the first element of some other list, then that other
> list's `first` pointer would not be updated. Similarly, it could be a
> data race if you try to remove it from two different lists in parallel.
> (There's no problem with passing `remove` an item that's not in any
> list. Additionally, other removal methods such as `pop_front` need not
> be unsafe, as they can't be used to remove items from another list.)
>
> Signed-off-by: Alice Ryhl <aliceryhl@xxxxxxxxxx>
> ---
> rust/kernel/list.rs | 294 +++++++++++++++++++++++++++++++++++++++++++++++-
> rust/kernel/list/arc.rs | 6 +-
> 2 files changed, 295 insertions(+), 5 deletions(-)
>
> diff --git a/rust/kernel/list.rs b/rust/kernel/list.rs
> index 7af5109500f2..7e9ed802b26b 100644
> --- a/rust/kernel/list.rs
> +++ b/rust/kernel/list.rs
> @@ -6,6 +6,7 @@
>
> use crate::init::PinInit;
> use crate::types::Opaque;
> +use core::marker::PhantomData;
> use core::ptr;
>
> mod impl_list_item_mod;
> @@ -16,7 +17,41 @@
> impl_list_arc_safe, AtomicListArcTracker, ListArc, ListArcSafe, TryNewListArc,
> };
>
> -/// Implemented by types where a [`ListArc<Self>`] can be inserted into a `List`.
> +/// A linked list.
> +///
> +/// All elements in this linked list will be [`ListArc`] references to the value. Since a value can
> +/// only have one `ListArc` (for each pair of prev/next pointers), this ensures that the same
> +/// prev/next pointers are not used for several linked lists.
> +///
> +/// # Invariants
> +///
> +/// If the list is empty, then `first` is null. Otherwise, it points at the links field of the
> +/// first element of this list. The prev/next pointers of items in the list will always form a
> +/// cycle. This means that prev/next pointers for an item in a list are never null and never
> +/// dangling.

I think this is missing that all elements of the list are in `ListArc`s.

About "This means that prev/next pointers for an item in a list are
never null and never dangling.", I think it would be simpler to say that
"All prev/next pointers of items in the list are valid and form a cycle."

> +pub struct List<T: ?Sized + ListItem<ID>, const ID: u64 = 0> {
> + first: *mut ListLinksFields,
> + _ty: PhantomData<ListArc<T, ID>>,
> +}
> +
> +// SAFETY: This is a container of `ListArc<T, ID>`, and access to the container allows the same
> +// type of access to the `ListArc<T, ID>` elements.
> +unsafe impl<T, const ID: u64> Send for List<T, ID>
> +where
> + ListArc<T, ID>: Send,
> + T: ?Sized + ListItem<ID>,
> +{
> +}
> +// SAFETY: This is a container of `ListArc<T, ID>`, and access to the container allows the same
> +// type of access to the `ListArc<T, ID>` elements.
> +unsafe impl<T, const ID: u64> Sync for List<T, ID>
> +where
> + ListArc<T, ID>: Sync,
> + T: ?Sized + ListItem<ID>,
> +{
> +}
> +
> +/// Implemented by types where a [`ListArc<Self>`] can be inserted into a [`List`].
> ///
> /// # Safety
> ///
> @@ -56,7 +91,7 @@ pub unsafe trait ListItem<const ID: u64 = 0>: ListArcSafe<ID> {
> /// been called.
> unsafe fn view_value(me: *mut ListLinks<ID>) -> *const Self;
>
> - /// This is called when an item is inserted into a `List`.
> + /// This is called when an item is inserted into a [`List`].
> ///
> /// # Guarantees
> ///
> @@ -103,7 +138,6 @@ struct ListLinksFields {
> /// The fields are null if and only if this item is not in a list.
> #[repr(transparent)]
> pub struct ListLinks<const ID: u64 = 0> {
> - #[allow(dead_code)]
> inner: Opaque<ListLinksFields>,
> }
>
> @@ -125,4 +159,258 @@ pub fn new() -> impl PinInit<Self> {
> }),
> }
> }
> +
> + /// # Safety
> + ///
> + /// The pointer must be dereferencable.

I would use "`me` is valid.".

> + #[inline]
> + unsafe fn fields(me: *mut Self) -> *mut ListLinksFields {
> + // SAFETY: The caller promises that the pointer is valid.
> + unsafe { Opaque::raw_get(ptr::addr_of!((*me).inner)) }
> + }
> +
> + /// # Safety
> + ///
> + /// The pointer must be dereferencable.

Ditto.

> + #[inline]
> + unsafe fn from_fields(me: *mut ListLinksFields) -> *mut Self {
> + me.cast()
> + }
> +}
> +
> +impl<T: ?Sized + ListItem<ID>, const ID: u64> List<T, ID> {
> + /// Creates a new empty list.
> + pub const fn new() -> Self {
> + Self {
> + first: ptr::null_mut(),
> + _ty: PhantomData,
> + }
> + }
> +
> + /// Returns whether this list is empty.
> + pub fn is_empty(&self) -> bool {
> + self.first.is_null()
> + }
> +
> + /// Add the provided item to the back of the list.
> + pub fn push_back(&mut self, item: ListArc<T, ID>) {
> + let item = unsafe { ListLinks::fields(T::prepare_to_insert(ListArc::into_raw(item))) };

Missing SAFETY comment.

> +
> + if self.first.is_null() {
> + self.first = item;
> + // SAFETY: The caller just gave us ownership of these fields.
> + // INVARIANT: A linked list with one item should be cyclic.
> + unsafe {
> + (*item).next = item;
> + (*item).prev = item;
> + }
> + } else {
> + let next = self.first;
> + // SAFETY: We just checked that `next` is non-null.

Missing mention of the type invariant.

> + let prev = unsafe { (*next).prev };
> + // SAFETY: Pointers in a linked list are never dangling, and the caller just gave us
> + // ownership of the fields on `item`.
> + // INVARIANT: This correctly inserts `item` between `prev` and `next`.
> + unsafe {
> + (*item).next = next;
> + (*item).prev = prev;
> + (*prev).next = item;
> + (*next).prev = item;
> + }
> + }
> + }
> +
> + /// Add the provided item to the front of the list.
> + pub fn push_front(&mut self, item: ListArc<T, ID>) {
> + let item = unsafe { ListLinks::fields(T::prepare_to_insert(ListArc::into_raw(item))) };
> +
> + if self.first.is_null() {
> + // SAFETY: The caller just gave us ownership of these fields.
> + // INVARIANT: A linked list with one item should be cyclic.
> + unsafe {
> + (*item).next = item;
> + (*item).prev = item;
> + }
> + } else {
> + let next = self.first;
> + // SAFETY: We just checked that `next` is non-null.
> + let prev = unsafe { (*next).prev };
> + // SAFETY: Pointers in a linked list are never dangling, and the caller just gave us
> + // ownership of the fields on `item`.
> + // INVARIANT: This correctly inserts `item` between `prev` and `next`.
> + unsafe {
> + (*item).next = next;
> + (*item).prev = prev;
> + (*prev).next = item;
> + (*next).prev = item;
> + }
> + }
> + self.first = item;
> + }
> +
> + /// Removes the last item from this list.
> + pub fn pop_back(&mut self) -> Option<ListArc<T, ID>> {
> + if self.first.is_null() {
> + return None;
> + }
> +
> + // SAFETY: We just checked that the list is not empty.
> + let last = unsafe { (*self.first).prev };
> + // SAFETY: The last item of this list is in this list.
> + Some(unsafe { self.remove_internal(last) })
> + }
> +
> + /// Removes the first item from this list.
> + pub fn pop_front(&mut self) -> Option<ListArc<T, ID>> {
> + if self.first.is_null() {
> + return None;
> + }
> +
> + // SAFETY: The first item of this list is in this list.
> + Some(unsafe { self.remove_internal(self.first) })
> + }
> +
> + /// Removes the provided item from this list and returns it.
> + ///
> + /// This returns `None` if the item is not in the list.

I think this should say "Returns `None` if the item is not in a list.".
(Technically it should be "is not in a `List<T, ID>`", since it *can* be
in another list with a different ID.)

> + ///
> + /// # Safety
> + ///
> + /// The provided item must not be in a different linked list.
> + pub unsafe fn remove(&mut self, item: &T) -> Option<ListArc<T, ID>> {
> + let mut item = unsafe { ListLinks::fields(T::view_links(item)) };
> + // SAFETY: The user provided a reference, and reference are never dangling.
> + //
> + // As for why this is not a data race, there are two cases:
> + //
> + // * If `item` is not in any list, then these fields are read-only and null.
> + // * If `item` is in this list, then we have exclusive access to these fields since we
> + // have a mutable reference to the list.
> + //
> + // In either case, there's no race.
> + let ListLinksFields { next, prev } = unsafe { *item };
> +
> + debug_assert_eq!(next.is_null(), prev.is_null());
> + if !next.is_null() {
> + // This is really a no-op, but this ensures that `item` is a raw pointer that was
> + // obtained without going through a pointer->reference->pointer conversion rountrip.
> + // This ensures that the list is valid under the more restrictive strict provenance
> + // ruleset.
> + //
> + // SAFETY: We just checked that `next` is not null, and it's not dangling by the
> + // list invariants.
> + unsafe {
> + debug_assert_eq!(item, (*next).prev);
> + item = (*next).prev;
> + }
> +
> + // SAFETY: We just checked that `item` is in a list, so the caller guarantees that it
> + // is in this list. The pointers are in the right order.
> + Some(unsafe { self.remove_internal_inner(item, next, prev) })
> + } else {
> + None
> + }
> + }
> +
> + /// Removes the provided item from the list.
> + ///
> + /// # Safety
> + ///
> + /// The pointer must point at an item in this list.
> + unsafe fn remove_internal(&mut self, item: *mut ListLinksFields) -> ListArc<T, ID> {
> + // SAFETY: The caller promises that this pointer is not dangling, and there's no data race
> + // since we have a mutable reference to the list containing `item`.
> + let ListLinksFields { next, prev } = unsafe { *item };
> + // SAFETY: The pointers are ok and in the right order.
> + unsafe { self.remove_internal_inner(item, next, prev) }
> + }
> +
> + /// Removes the provided item from the list.
> + ///
> + /// # Safety
> + ///
> + /// The `item` pointer must point at an item in this list, and we must have `(*item).next ==
> + /// next` and `(*item).prev == prev`.
> + unsafe fn remove_internal_inner(
> + &mut self,
> + item: *mut ListLinksFields,
> + next: *mut ListLinksFields,
> + prev: *mut ListLinksFields,
> + ) -> ListArc<T, ID> {
> + // SAFETY: We have exclusive access to items in the list, and prev/next pointers are

I think you mean that you have exclusive access to the prev/next fields
of the `ListLinks` associated with `ID`... But that is rather long.
Does anyone have any idea to shorten this?

> + // never null for items in a list.
> + //
> + // INVARIANT: There are three cases:
> + // * If the list has at least three items, then after removing the item, `prev` and `next`
> + // will be next to each other.
> + // * If the list has two items, then the remaining item will point at itself.
> + // * If the list has one item, then `next == prev == item`, so these writes have no effect
> + // due to the writes to `item` below.

I think the writes do not have an effect. (no need to reference the
writes to `item` below)

> + unsafe {
> + (*next).prev = prev;
> + (*prev).next = next;
> + }
> + // SAFETY: We have exclusive access to items in the list.
> + // INVARIANT: The item is no longer in a list, so the pointers should be null.
> + unsafe {
> + (*item).prev = ptr::null_mut();
> + (*item).next = ptr::null_mut();
> + }
> + // INVARIANT: There are three cases:
> + // * If `item` was not the first item, then `self.first` should remain unchanged.
> + // * If `item` was the first item and there is another item, then we just updated
> + // `prev->next` to `next`, which is the new first item, and setting `item->next` to null
> + // did not modify `prev->next`.
> + // * If `item` was the only item in the list, then `prev == item`, and we just set
> + // `item->next` to null, so this correctly sets `first` to null now that the list is
> + // empty.
> + if self.first == item {
> + // SAFETY: The `prev` field of an item in a list is never dangling.

I don't think this SAFETY comment makes sense.

--
Cheers,
Benno

> + self.first = unsafe { (*prev).next };
> + }
> +
> + // SAFETY: We just removed a `ListArc` from the list, so we can turn it back into a
> + // `ListArc`.
> + unsafe { ListArc::from_raw(T::post_remove(ListLinks::from_fields(item))) }
> + }
> +
> + /// Moves all items from `other` into `self`.
> + ///
> + /// The items of `other` are added to the back of `self`, so the last item of `other` becomes
> + /// the last item of `self`.
> + pub fn push_all_back(&mut self, other: &mut List<T, ID>) {
> + // First, we insert the elements into `self`. At the end, we make `other` empty.
> + if self.is_empty() {
> + // INVARIANT: All of the elements in `other` become elements of `self`.
> + self.first = other.first;
> + } else if !other.is_empty() {
> + let other_first = other.first;
> + // SAFETY: The other list is not empty, so this pointer is valid.
> + let other_last = unsafe { (*other_first).prev };
> + let self_first = self.first;
> + // SAFETY: The self list is not empty, so this pointer is valid.
> + let self_last = unsafe { (*self_first).prev };
> +
> + // SAFETY: We have exclusive access to both lists, so we can update the pointers.
> + // INVARIANT: This correctly sets the pointers to merge both lists. We do not need to
> + // update `self.first` because the first element of `self` does not change.
> + unsafe {
> + (*self_first).prev = other_last;
> + (*other_last).next = self_first;
> + (*self_last).next = other_first;
> + (*other_first).prev = self_last;
> + }
> + }
> +
> + // INVARIANT: The other list is now empty, so update its pointer.
> + other.first = ptr::null_mut();
> + }
> +}
> +
> +impl<T: ?Sized + ListItem<ID>, const ID: u64> Drop for List<T, ID> {
> + fn drop(&mut self) {
> + while let Some(item) = self.pop_front() {
> + drop(item);
> + }
> + }
> }