Re: [PATCH v8 5/6] rust: rbtree: add `RBTreeCursor`

From: Alice Ryhl
Date: Tue Aug 06 2024 - 04:25:33 EST


On Mon, Aug 5, 2024 at 9:35 PM Benno Lossin <benno.lossin@xxxxxxxxx> wrote:
>
> On 27.07.24 22:30, Matt Gilbride wrote:
> > +    /// Returns a cursor over the tree nodes based on the given key.
> > +    ///
> > +    /// If the given key exists, the cursor starts there.
> > +    /// Otherwise it starts with the first larger key in sort order.
> > +    /// If there is no larger key, it returns [`None`].
> > +    pub fn cursor_lower_bound(&mut self, key: &K) -> Option<RBTreeCursor<'_, K, V>>
> > +    where
> > +        K: Ord,
> > +    {
> > +        let mut node = self.root.rb_node;
> > +        let mut best_match: Option<NonNull<Node<K, V>>> = None;
> > +        while !node.is_null() {
> > +            // SAFETY: By the type invariant of `Self`, all non-null `rb_node` pointers stored in `self`
> > +            // point to the links field of `Node<K, V>` objects.
> > +            let this = unsafe { container_of!(node, Node<K, V>, links) }.cast_mut();
> > +            // SAFETY: `this` is a non-null node so it is valid by the type invariants.
> > +            let this_key = unsafe { &(*this).key };
> > +            // SAFETY: `node` is a non-null node so it is valid by the type invariants.
> > +            let left_child = unsafe { (*node).rb_left };
> > +            // SAFETY: `node` is a non-null node so it is valid by the type invariants.
> > +            let right_child = unsafe { (*node).rb_right };
> > +            if key == this_key {
> > +                return NonNull::new(node).map(|current| {
> > +                    // INVARIANT:
> > +                    // - `node` is a valid node in the [`RBTree`] pointed to by `self`.
> > +                    // - Due to the type signature of this function, the returned [`RBTreeCursor`]
> > +                    // borrows mutably from `self`.
> > +                    RBTreeCursor {
> > +                        current,
> > +                        tree: self,
> > +                    }
> > +                });
> > +            } else {
> > +                node = if key > this_key {
> > +                    right_child
> > +                } else {
> > +                    let is_better_match = match best_match {
> > +                        None => true,
> > +                        Some(best) => {
> > +                            // SAFETY: `best` is a non-null node so it is valid by the type invariants.
> > +                            let best_key = unsafe { &(*best.as_ptr()).key };
> > +                            best_key > this_key
> > +                        }
> > +                    };
> > +                    if is_better_match {
> > +                        best_match = NonNull::new(this);
> > +                    }
> > +                    left_child
> > +                };
> > +            }
> > +        }
> > +
> > +        let best = best_match?;
> > +
> > +        // SAFETY: `best` is a non-null node so it is valid by the type invariants.
> > +        let links = unsafe { addr_of_mut!((*best.as_ptr()).links) };
> > +
> > +        NonNull::new(links).map(|current| {
>
> Why would `links` be a null pointer? AFAIK it just came from `best`
> which is non-null. (I don't know if we want to use `new_unchecked`
> instead, but wanted to mention it)

It's never a null pointer in this branch. Do you prefer an extra
`unsafe` block to call `NonNull::new_unchecked`?
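
For reference, the two options can be compared in a minimal standalone
sketch. This is not the kernel code: `Node` below is a hypothetical
stand-in with a plain integer `links` field, and the helper names
`links_checked` and `links_unchecked` are made up for the illustration.

```rust
use std::ptr::{addr_of_mut, NonNull};

// Hypothetical stand-in for the tree node type (assumption, not the kernel type).
struct Node {
    key: u32,
    links: u64,
}

// Option 1: the checked constructor. No extra `unsafe` block is needed,
// but the `None` case can never be hit for a valid `best`.
fn links_checked(best: NonNull<Node>) -> Option<NonNull<u64>> {
    // SAFETY: The caller passes a pointer to a live `Node`.
    let links = unsafe { addr_of_mut!((*best.as_ptr()).links) };
    NonNull::new(links)
}

// Option 2: the unchecked constructor. This needs a second `unsafe` block
// and its own justification, but returns the pointer directly.
fn links_unchecked(best: NonNull<Node>) -> NonNull<u64> {
    // SAFETY: The caller passes a pointer to a live `Node`.
    let links = unsafe { addr_of_mut!((*best.as_ptr()).links) };
    // SAFETY: `links` is a field projection of a non-null, valid pointer,
    // so it is non-null as well.
    unsafe { NonNull::new_unchecked(links) }
}

fn main() {
    let mut node = Node { key: 1, links: 7 };
    let best = NonNull::from(&mut node);

    // Both variants yield the same pointer to the `links` field.
    let checked = links_checked(best).unwrap();
    let unchecked = links_unchecked(best);
    assert_eq!(checked, unchecked);

    // SAFETY: Both pointers refer to fields of `node`, which is still alive.
    assert_eq!(unsafe { *unchecked.as_ptr() }, 7);
    assert_eq!(unsafe { (*best.as_ptr()).key }, 1);
}
```

The trade-off is between one more `unsafe` block with its own SAFETY
comment and an `Option` whose `None` arm is dead code on this path.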

> > +            // INVARIANT:
> > +            // - `current` is a valid node in the [`RBTree`] pointed to by `self`.
> > +            // - Due to the type signature of this function, the returned [`RBTreeCursor`]
> > +            // borrows mutably from `self`.
> > +            RBTreeCursor {
> > +                current,
> > +                tree: self,
> > +            }
> > +        })
> > +    }
>
> [...]
>
> > +/// // Calling `remove_next` removes and returns the last element.
> > +/// assert_eq!(cursor.remove_next().unwrap().to_key_value(), (30, 300));
> > +///
> > +/// # Ok::<(), Error>(())
> > +/// ```
>
> I would put a newline here.

Ok.

> > +/// # Invariants
> > +/// - `current` points to a node that is in the same [`RBTree`] as `tree`.
> > +pub struct RBTreeCursor<'a, K, V> {
>
> I think we can name it just `Cursor`, since one can refer to it as
> `rbtree::Cursor` and then it also follows the naming scheme for `Iter`
> etc.

You are welcome to submit that as a follow-up change.

> > +    tree: &'a mut RBTree<K, V>,
> > +    current: NonNull<bindings::rb_node>,
> > +}
> > +
> > +// SAFETY: The [`RBTreeCursor`] gives out immutable references to K and mutable references to V,
> > +// so it has the same thread safety requirements as mutable references.
> > +unsafe impl<'a, K: Send, V: Send> Send for RBTreeCursor<'a, K, V> {}
>
> Again, do we want to use `K: Sync` here instead?

In this case, `K: Send` and `K: Sync` are both sufficient conditions,
but `K: Send` will generally be less restrictive for the user.
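
As a small illustration of why a `Send` bound can admit more key types,
here is a standalone sketch using `std` (not kernel code). The helper
`move_key_to_thread` is hypothetical and only mirrors the shape of a
`K: Send` bound; `Cell<u32>` is used because it is `Send` but not `Sync`.

```rust
use std::cell::Cell;
use std::thread;

// Hypothetical helper: accepts any key type that may be moved to another
// thread, which is exactly what a `K: Send` bound permits.
fn move_key_to_thread<K: Send + 'static>(key: K) -> K {
    thread::spawn(move || key).join().unwrap()
}

fn main() {
    // `Cell<u32>` is `Send` but not `Sync`: it satisfies a `K: Send` bound,
    // while a `K: Sync` bound would reject it at compile time.
    let key = Cell::new(5u32);
    let back = move_key_to_thread(key);
    assert_eq!(back.get(), 5);
}
```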

> > +    fn peek(&self, direction: Direction) -> Option<(&K, &V)> {
> > +        self.get_neighbor_raw(direction)
> > +            // SAFETY:
> > +            // - `neighbor` is a valid tree node.
> > +            // - By the function signature, we have an immutable reference to `self`.
> > +            .map(|neighbor| unsafe { Self::to_key_value(neighbor) })
>
> Alternative way of formatting this:
>
>     self.get_neighbor_raw(direction).map(|neighbor| {
>         // SAFETY:
>         // - `neighbor` is a valid tree node.
>         // - By the function signature, we have an immutable reference to `self`.
>         unsafe { Self::to_key_value(neighbor) }
>     })
>
> I think it looks nicer, but we should probably have a written
> preference.

We can reformat since we need another version anyway, but otherwise I
would have asked you to make this a follow-up change.

> > +    }
> > +
> > +    /// Access the previous node mutably without moving the cursor.
> > +    pub fn peek_prev_mut(&mut self) -> Option<(&K, &mut V)> {
> > +        self.peek_mut(Direction::Prev)
> > +    }
> > +
> > +    /// Access the next node mutably without moving the cursor.
> > +    pub fn peek_next_mut(&mut self) -> Option<(&K, &mut V)> {
> > +        self.peek_mut(Direction::Next)
> > +    }
> > +
> > +    fn peek_mut(&mut self, direction: Direction) -> Option<(&K, &mut V)> {
> > +        self.get_neighbor_raw(direction)
> > +            // SAFETY:
> > +            // - `neighbor` is a valid tree node.
> > +            // - By the function signature, we have a mutable reference to `self`.
> > +            .map(|neighbor| unsafe { Self::to_key_value_mut(neighbor) })
>
> Ditto.

Ditto.

Alice