Re: [PATCH 5/9] rust: list: add List

From: Alice Ryhl
Date: Thu Apr 04 2024 - 10:12:38 EST


On Thu, Apr 4, 2024 at 4:03 PM Benno Lossin <benno.lossin@xxxxxxxxx> wrote:
>
> On 02.04.24 14:17, Alice Ryhl wrote:
> > Add the actual linked list itself.
> >
> > The linked list uses the following design: The List type itself just has
> > a single pointer to the first element of the list. And the actual list
> > items then form a cycle. So the last item is `first->prev`.
> >
> > This is slightly different from the usual kernel linked list. Matching
> > that exactly would amount to giving List two pointers, and having it be
> > part of the cycle of items. This alternate design has the advantage that
> > the cycle is never completely empty, which can reduce the number of
> > branches in some cases. However, it also has the disadvantage that List
> > must be pinned, which this design is trying to avoid.
> >
> > Having the list items form a cycle rather than having null pointers at
> > the beginning/end is convenient for several reasons. For one, it lets us
> > store only one pointer in List, and it simplifies the implementation of
> > several functions.
> >
> > Unfortunately, the `remove` function that removes an arbitrary element
> > from the list has to be unsafe. This is needed because there is no way
> > to handle the case where you pass an element from the wrong list. For
> > example, if it is the first element of some other list, then that other
> > list's `first` pointer would not be updated. Similarly, it could be a
> > data race if you try to remove it from two different lists in parallel.
> > (There's no problem with passing `remove` an item that's not in any
> > list. Additionally, other removal methods such as `pop_front` need not
> > be unsafe, as they can't be used to remove items from another list.)
> >
> > Signed-off-by: Alice Ryhl <aliceryhl@xxxxxxxxxx>
> > ---
> > rust/kernel/list.rs | 294 +++++++++++++++++++++++++++++++++++++++++++++++-
> > rust/kernel/list/arc.rs | 6 +-
> > 2 files changed, 295 insertions(+), 5 deletions(-)
> >
> > diff --git a/rust/kernel/list.rs b/rust/kernel/list.rs
> > index 7af5109500f2..7e9ed802b26b 100644
> > --- a/rust/kernel/list.rs
> > +++ b/rust/kernel/list.rs
> > @@ -6,6 +6,7 @@
> >
> > use crate::init::PinInit;
> > use crate::types::Opaque;
> > +use core::marker::PhantomData;
> > use core::ptr;
> >
> > mod impl_list_item_mod;
> > @@ -16,7 +17,41 @@
> > impl_list_arc_safe, AtomicListArcTracker, ListArc, ListArcSafe, TryNewListArc,
> > };
> >
> > -/// Implemented by types where a [`ListArc<Self>`] can be inserted into a `List`.
> > +/// A linked list.
> > +///
> > +/// All elements in this linked list will be [`ListArc`] references to the value. Since a value can
> > +/// only have one `ListArc` (for each pair of prev/next pointers), this ensures that the same
> > +/// prev/next pointers are not used for several linked lists.
> > +///
> > +/// # Invariants
> > +///
> > +/// If the list is empty, then `first` is null. Otherwise, it points at the links field of the
> > +/// first element of this list. The prev/next pointers of items in the list will always form a
> > +/// cycle. This means that prev/next pointers for an item in a list are never null and never
> > +/// dangling.
>
> I think this is missing that all elements of the list are in `ListArc`s.
>
> About "This means that prev/next pointers for an item in a list are
> never null and never dangling.", I think it would be simpler to say that
> "All prev/next pointers of items in the list are valid and form a cycle."

Sure.

> > +pub struct List<T: ?Sized + ListItem<ID>, const ID: u64 = 0> {
> > + first: *mut ListLinksFields,
> > + _ty: PhantomData<ListArc<T, ID>>,
> > +}
> > +
> > +// SAFETY: This is a container of `ListArc<T, ID>`, and access to the container allows the same
> > +// type of access to the `ListArc<T, ID>` elements.
> > +unsafe impl<T, const ID: u64> Send for List<T, ID>
> > +where
> > + ListArc<T, ID>: Send,
> > + T: ?Sized + ListItem<ID>,
> > +{
> > +}
> > +// SAFETY: This is a container of `ListArc<T, ID>`, and access to the container allows the same
> > +// type of access to the `ListArc<T, ID>` elements.
> > +unsafe impl<T, const ID: u64> Sync for List<T, ID>
> > +where
> > + ListArc<T, ID>: Sync,
> > + T: ?Sized + ListItem<ID>,
> > +{
> > +}
> > +
> > +/// Implemented by types where a [`ListArc<Self>`] can be inserted into a [`List`].
> > ///
> > /// # Safety
> > ///
> > @@ -56,7 +91,7 @@ pub unsafe trait ListItem<const ID: u64 = 0>: ListArcSafe<ID> {
> > /// been called.
> > unsafe fn view_value(me: *mut ListLinks<ID>) -> *const Self;
> >
> > - /// This is called when an item is inserted into a `List`.
> > + /// This is called when an item is inserted into a [`List`].
> > ///
> > /// # Guarantees
> > ///
> > @@ -103,7 +138,6 @@ struct ListLinksFields {
> > /// The fields are null if and only if this item is not in a list.
> > #[repr(transparent)]
> > pub struct ListLinks<const ID: u64 = 0> {
> > - #[allow(dead_code)]
> > inner: Opaque<ListLinksFields>,
> > }
> >
> > @@ -125,4 +159,258 @@ pub fn new() -> impl PinInit<Self> {
> > }),
> > }
> > }
> > +
> > + /// # Safety
> > + ///
> > + /// The pointer must be dereferencable.
>
> I would use "`me` is valid.".

Ok.

> > + #[inline]
> > + unsafe fn fields(me: *mut Self) -> *mut ListLinksFields {
> > + // SAFETY: The caller promises that the pointer is valid.
> > + unsafe { Opaque::raw_get(ptr::addr_of!((*me).inner)) }
> > + }
> > +
> > + /// # Safety
> > + ///
> > + /// The pointer must be dereferencable.
>
> Ditto.

Ok.

> > + #[inline]
> > + unsafe fn from_fields(me: *mut ListLinksFields) -> *mut Self {
> > + me.cast()
> > + }
> > +}
> > +
> > +impl<T: ?Sized + ListItem<ID>, const ID: u64> List<T, ID> {
> > + /// Creates a new empty list.
> > + pub const fn new() -> Self {
> > + Self {
> > + first: ptr::null_mut(),
> > + _ty: PhantomData,
> > + }
> > + }
> > +
> > + /// Returns whether this list is empty.
> > + pub fn is_empty(&self) -> bool {
> > + self.first.is_null()
> > + }
> > +
> > + /// Add the provided item to the back of the list.
> > + pub fn push_back(&mut self, item: ListArc<T, ID>) {
> > + let item = unsafe { ListLinks::fields(T::prepare_to_insert(ListArc::into_raw(item))) };
>
> Missing SAFETY comment.

Will add one here and in push_front.

> > +
> > + if self.first.is_null() {
> > + self.first = item;
> > + // SAFETY: The caller just gave us ownership of these fields.
> > + // INVARIANT: A linked list with one item should be cyclic.
> > + unsafe {
> > + (*item).next = item;
> > + (*item).prev = item;
> > + }
> > + } else {
> > + let next = self.first;
> > + // SAFETY: We just checked that `next` is non-null.
>
> Missing mention of the type invariant.

SAFETY: By the type invariant, this pointer is valid or null. We just
checked that it's not null, so it must be valid.

> > + let prev = unsafe { (*next).prev };
> > + // SAFETY: Pointers in a linked list are never dangling, and the caller just gave us
> > + // ownership of the fields on `item`.
> > + // INVARIANT: This correctly inserts `item` between `prev` and `next`.
> > + unsafe {
> > + (*item).next = next;
> > + (*item).prev = prev;
> > + (*prev).next = item;
> > + (*next).prev = item;
> > + }
> > + }
> > + }
> > +
> > + /// Add the provided item to the front of the list.
> > + pub fn push_front(&mut self, item: ListArc<T, ID>) {
> > + let item = unsafe { ListLinks::fields(T::prepare_to_insert(ListArc::into_raw(item))) };
> > +
> > + if self.first.is_null() {
> > + // SAFETY: The caller just gave us ownership of these fields.
> > + // INVARIANT: A linked list with one item should be cyclic.
> > + unsafe {
> > + (*item).next = item;
> > + (*item).prev = item;
> > + }
> > + } else {
> > + let next = self.first;
> > + // SAFETY: We just checked that `next` is non-null.
> > + let prev = unsafe { (*next).prev };
> > + // SAFETY: Pointers in a linked list are never dangling, and the caller just gave us
> > + // ownership of the fields on `item`.
> > + // INVARIANT: This correctly inserts `item` between `prev` and `next`.
> > + unsafe {
> > + (*item).next = next;
> > + (*item).prev = prev;
> > + (*prev).next = item;
> > + (*next).prev = item;
> > + }
> > + }
> > + self.first = item;
> > + }
> > +
> > + /// Removes the last item from this list.
> > + pub fn pop_back(&mut self) -> Option<ListArc<T, ID>> {
> > + if self.first.is_null() {
> > + return None;
> > + }
> > +
> > + // SAFETY: We just checked that the list is not empty.
> > + let last = unsafe { (*self.first).prev };
> > + // SAFETY: The last item of this list is in this list.
> > + Some(unsafe { self.remove_internal(last) })
> > + }
> > +
> > + /// Removes the first item from this list.
> > + pub fn pop_front(&mut self) -> Option<ListArc<T, ID>> {
> > + if self.first.is_null() {
> > + return None;
> > + }
> > +
> > + // SAFETY: The first item of this list is in this list.
> > + Some(unsafe { self.remove_internal(self.first) })
> > + }
> > +
> > + /// Removes the provided item from this list and returns it.
> > + ///
> > + /// This returns `None` if the item is not in the list.
>
> I think this should say "Returns `None` if the item is not in a list.".
> (Technically it should be "is not in a `List<T, ID>`", since it *can* be
> in another list with a different ID.)

I'm not really convinced. The phrases "the list" and "a list" are
equivalent given the safety requirement for this method, but "the
list" seems more natural to me. The `remove` method of any other
collection would say "the list" too.

> > + ///
> > + /// # Safety
> > + ///
> > + /// The provided item must not be in a different linked list.
> > + pub unsafe fn remove(&mut self, item: &T) -> Option<ListArc<T, ID>> {
> > + let mut item = unsafe { ListLinks::fields(T::view_links(item)) };
> > + // SAFETY: The user provided a reference, and reference are never dangling.
> > + //
> > + // As for why this is not a data race, there are two cases:
> > + //
> > + // * If `item` is not in any list, then these fields are read-only and null.
> > + // * If `item` is in this list, then we have exclusive access to these fields since we
> > + // have a mutable reference to the list.
> > + //
> > + // In either case, there's no race.
> > + let ListLinksFields { next, prev } = unsafe { *item };
> > +
> > + debug_assert_eq!(next.is_null(), prev.is_null());
> > + if !next.is_null() {
> > + // This is really a no-op, but this ensures that `item` is a raw pointer that was
> > + // obtained without going through a pointer->reference->pointer conversion rountrip.
> > + // This ensures that the list is valid under the more restrictive strict provenance
> > + // ruleset.
> > + //
> > + // SAFETY: We just checked that `next` is not null, and it's not dangling by the
> > + // list invariants.
> > + unsafe {
> > + debug_assert_eq!(item, (*next).prev);
> > + item = (*next).prev;
> > + }
> > +
> > + // SAFETY: We just checked that `item` is in a list, so the caller guarantees that it
> > + // is in this list. The pointers are in the right order.
> > + Some(unsafe { self.remove_internal_inner(item, next, prev) })
> > + } else {
> > + None
> > + }
> > + }
> > +
> > + /// Removes the provided item from the list.
> > + ///
> > + /// # Safety
> > + ///
> > + /// The pointer must point at an item in this list.
> > + unsafe fn remove_internal(&mut self, item: *mut ListLinksFields) -> ListArc<T, ID> {
> > + // SAFETY: The caller promises that this pointer is not dangling, and there's no data race
> > + // since we have a mutable reference to the list containing `item`.
> > + let ListLinksFields { next, prev } = unsafe { *item };
> > + // SAFETY: The pointers are ok and in the right order.
> > + unsafe { self.remove_internal_inner(item, next, prev) }
> > + }
> > +
> > + /// Removes the provided item from the list.
> > + ///
> > + /// # Safety
> > + ///
> > + /// The `item` pointer must point at an item in this list, and we must have `(*item).next ==
> > + /// next` and `(*item).prev == prev`.
> > + unsafe fn remove_internal_inner(
> > + &mut self,
> > + item: *mut ListLinksFields,
> > + next: *mut ListLinksFields,
> > + prev: *mut ListLinksFields,
> > + ) -> ListArc<T, ID> {
> > + // SAFETY: We have exclusive access to items in the list, and prev/next pointers are
>
> I think you mean that you have exclusive access to the prev/next fields
> of the `ListLinks` associated with `ID`... But that is rather long.
> Does anyone have any idea to shorten this?

SAFETY: We have exclusive access to the pointers of items in the list,
and the prev/next pointers are never null for items in a list.

> > + // never null for items in a list.
> > + //
> > + // INVARIANT: There are three cases:
> > + // * If the list has at least three items, then after removing the item, `prev` and `next`
> > + // will be next to each other.
> > + // * If the list has two items, then the remaining item will point at itself.
> > + // * If the list has one item, then `next == prev == item`, so these writes have no effect
> > + // due to the writes to `item` below.
>
> I think the writes do not have an effect. (no need to reference the
> writes to `item` below)

?

> > + unsafe {
> > + (*next).prev = prev;
> > + (*prev).next = next;
> > + }
> > + // SAFETY: We have exclusive access to items in the list.
> > + // INVARIANT: The item is no longer in a list, so the pointers should be null.
> > + unsafe {
> > + (*item).prev = ptr::null_mut();
> > + (*item).next = ptr::null_mut();
> > + }
> > + // INVARIANT: There are three cases:
> > + // * If `item` was not the first item, then `self.first` should remain unchanged.
> > + // * If `item` was the first item and there is another item, then we just updated
> > + // `prev->next` to `next`, which is the new first item, and setting `item->next` to null
> > + // did not modify `prev->next`.
> > + // * If `item` was the only item in the list, then `prev == item`, and we just set
> > + // `item->next` to null, so this correctly sets `first` to null now that the list is
> > + // empty.
> > + if self.first == item {
> > + // SAFETY: The `prev` field of an item in a list is never dangling.
>
> I don't think this SAFETY comment makes sense.
>
> > + self.first = unsafe { (*prev).next };

SAFETY: `prev` is the `prev` pointer from a `ListLinks` in a `List`,
so the pointer is valid. There's no race, since we have exclusive
access to the list.

Alice