Re: [PATCH 5/9] rust: list: add List

From: Benno Lossin
Date: Thu Apr 04 2024 - 10:51:32 EST


On 04.04.24 16:12, Alice Ryhl wrote:
> On Thu, Apr 4, 2024 at 4:03 PM Benno Lossin <benno.lossin@xxxxxxxxx> wrote:
>> On 02.04.24 14:17, Alice Ryhl wrote:
>>> +
>>> + if self.first.is_null() {
>>> + self.first = item;
>>> + // SAFETY: The caller just gave us ownership of these fields.
>>> + // INVARIANT: A linked list with one item should be cyclic.
>>> + unsafe {
>>> + (*item).next = item;
>>> + (*item).prev = item;
>>> + }
>>> + } else {
>>> + let next = self.first;
>>> + // SAFETY: We just checked that `next` is non-null.
>>
>> Missing mention of the type invariant.
>
> SAFETY: By the type invariant, this pointer is valid or null. We just
> checked that it's not null, so it must be valid.

Sounds good.

[...]

>>> + /// Removes the first item from this list.
>>> + pub fn pop_front(&mut self) -> Option<ListArc<T, ID>> {
>>> + if self.first.is_null() {
>>> + return None;
>>> + }
>>> +
>>> + // SAFETY: The first item of this list is in this list.
>>> + Some(unsafe { self.remove_internal(self.first) })
>>> + }
>>> +
>>> + /// Removes the provided item from this list and returns it.
>>> + ///
>>> + /// This returns `None` if the item is not in the list.
>>
>> I think this should say "Returns `None` if the item is not in a list.".
>> (Technically it should be "is not in a `List<T, ID>`", since it *can* be
>> in another list with a different ID.)
>
> I'm not really convinced. The phrases "the list" and "a list" are
> equivalent given the safety requirement for this method, but "the
> list" seems more natural to me. The `remove` method of any other
> collection would say "the list" too.

They are equivalent, but saying "the list" has the potential for this
confusion: "If the function returns `None` if the item is not in the
list, then why do I need to ensure that it is not in a different list?".
>
>>> + ///
>>> + /// # Safety
>>> + ///
>>> + /// The provided item must not be in a different linked list.
>>> + pub unsafe fn remove(&mut self, item: &T) -> Option<ListArc<T, ID>> {
>>> + let mut item = unsafe { ListLinks::fields(T::view_links(item)) };
>>> + // SAFETY: The user provided a reference, and reference are never dangling.
>>> + //
>>> + // As for why this is not a data race, there are two cases:
>>> + //
>>> + // * If `item` is not in any list, then these fields are read-only and null.
>>> + // * If `item` is in this list, then we have exclusive access to these fields since we
>>> + // have a mutable reference to the list.
>>> + //
>>> + // In either case, there's no race.
>>> + let ListLinksFields { next, prev } = unsafe { *item };
>>> +
>>> + debug_assert_eq!(next.is_null(), prev.is_null());
>>> + if !next.is_null() {
>>> + // This is really a no-op, but this ensures that `item` is a raw pointer that was
>>> + // obtained without going through a pointer->reference->pointer conversion rountrip.
>>> + // This ensures that the list is valid under the more restrictive strict provenance
>>> + // ruleset.
>>> + //
>>> + // SAFETY: We just checked that `next` is not null, and it's not dangling by the
>>> + // list invariants.
>>> + unsafe {
>>> + debug_assert_eq!(item, (*next).prev);
>>> + item = (*next).prev;
>>> + }
>>> +
>>> + // SAFETY: We just checked that `item` is in a list, so the caller guarantees that it
>>> + // is in this list. The pointers are in the right order.
>>> + Some(unsafe { self.remove_internal_inner(item, next, prev) })
>>> + } else {
>>> + None
>>> + }
>>> + }
>>> +
>>> + /// Removes the provided item from the list.
>>> + ///
>>> + /// # Safety
>>> + ///
>>> + /// The pointer must point at an item in this list.
>>> + unsafe fn remove_internal(&mut self, item: *mut ListLinksFields) -> ListArc<T, ID> {
>>> + // SAFETY: The caller promises that this pointer is not dangling, and there's no data race
>>> + // since we have a mutable reference to the list containing `item`.
>>> + let ListLinksFields { next, prev } = unsafe { *item };
>>> + // SAFETY: The pointers are ok and in the right order.
>>> + unsafe { self.remove_internal_inner(item, next, prev) }
>>> + }
>>> +
>>> + /// Removes the provided item from the list.
>>> + ///
>>> + /// # Safety
>>> + ///
>>> + /// The `item` pointer must point at an item in this list, and we must have `(*item).next ==
>>> + /// next` and `(*item).prev == prev`.
>>> + unsafe fn remove_internal_inner(
>>> + &mut self,
>>> + item: *mut ListLinksFields,
>>> + next: *mut ListLinksFields,
>>> + prev: *mut ListLinksFields,
>>> + ) -> ListArc<T, ID> {
>>> + // SAFETY: We have exclusive access to items in the list, and prev/next pointers are
>>
>> I think you mean that you have exclusive access to the prev/next fields
>> of the `ListLinks` associated with `ID`... But that is rather long.
>> Does anyone have any idea to shorten this?
>
> SAFETY: We have exclusive access to the pointers of items in the list,
> and the prev/next pointers are never null for items in a list.

I would say that they are valid instead of never null, since you
dereference them below. Otherwise sounds good.

>
>>> + // never null for items in a list.
>>> + //
>>> + // INVARIANT: There are three cases:
>>> + // * If the list has at least three items, then after removing the item, `prev` and `next`
>>> + // will be next to each other.
>>> + // * If the list has two items, then the remaining item will point at itself.
>>> + // * If the list has one item, then `next == prev == item`, so these writes have no effect
>>> + // due to the writes to `item` below.
>>
>> I think the writes do not have an effect. (no need to reference the
>> writes to `item` below)
>
> ?

The first write is

(*next).prev = prev;

Using the fact that `next == prev == item` we have

(*item).prev = prev;

But that is already true, since the function requirement is that
`(*item).prev == prev`. So the write has no effect.
The same should hold for `(*prev).next = next`.

>
>>> + unsafe {
>>> + (*next).prev = prev;
>>> + (*prev).next = next;
>>> + }
>>> + // SAFETY: We have exclusive access to items in the list.
>>> + // INVARIANT: The item is no longer in a list, so the pointers should be null.
>>> + unsafe {
>>> + (*item).prev = ptr::null_mut();
>>> + (*item).next = ptr::null_mut();
>>> + }
>>> + // INVARIANT: There are three cases:
>>> + // * If `item` was not the first item, then `self.first` should remain unchanged.
>>> + // * If `item` was the first item and there is another item, then we just updated
>>> + // `prev->next` to `next`, which is the new first item, and setting `item->next` to null
>>> + // did not modify `prev->next`.
>>> + // * If `item` was the only item in the list, then `prev == item`, and we just set
>>> + // `item->next` to null, so this correctly sets `first` to null now that the list is
>>> + // empty.
>>> + if self.first == item {
>>> + // SAFETY: The `prev` field of an item in a list is never dangling.
>>
>> I don't think this SAFETY comment makes sense.
>>
>>> + self.first = unsafe { (*prev).next };
>
> SAFETY: `prev` is the `prev` pointer from a `ListLinks` in a `List`,
> so the pointer is valid. There's no race, since we have exclusive
> access to the list.

Sounds good.

--
Cheers,
Benno