Re: [PATCH v3 1/5] rust: rbtree: add red-black tree implementation backed by the C version

From: Benno Lossin
Date: Thu Apr 25 2024 - 17:26:42 EST


On 18.04.24 16:15, Matt Gilbride wrote:
> diff --git a/rust/kernel/rbtree.rs b/rust/kernel/rbtree.rs
> new file mode 100644
> index 000000000000..ad406fc32d67
> --- /dev/null
> +++ b/rust/kernel/rbtree.rs
> @@ -0,0 +1,425 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +//! Red-black trees.
> +//!
> +//! C header: [`include/linux/rbtree.h`](srctree/include/linux/rbtree.h)
> +//!
> +//! Reference: <https://www.kernel.org/doc/html/latest/core-api/rbtree.html>
> +
> +use crate::{bindings, container_of, error::Result, prelude::*};
> +use alloc::boxed::Box;
> +use core::{
> +    cmp::{Ord, Ordering},
> +    convert::Infallible,
> +    marker::PhantomData,
> +    mem::MaybeUninit,
> +    ptr::{addr_of_mut, NonNull},
> +};
> +
> +struct Node<K, V> {
> +    links: bindings::rb_node,
> +    key: K,
> +    value: V,
> +}

Personal preference: I prefer putting items that give a high-level
overview of the module at the top. I don't feel like I gain anything
from seeing the definition of the `Node` type this early.

[...]

> +impl<K, V> RBTree<K, V> {
> +    /// Creates a new and empty tree.
> +    pub fn new() -> Self {
> +        Self {
> +            // INVARIANT: There are no nodes in the tree, so the invariant holds vacuously.
> +            root: bindings::rb_root::default(),
> +            _p: PhantomData,
> +        }
> +    }
> +
> +    /// Allocates memory for a node to be eventually initialised and inserted into the tree via a
> +    /// call to [`RBTree::insert`].
> +    pub fn try_reserve_node() -> Result<RBTreeNodeReservation<K, V>> {

This function creates an `RBTreeNodeReservation`, so I think it would
make sense to move it to that type and just name it `new`.

> +        Ok(RBTreeNodeReservation {
> +            node: Box::init::<Infallible>(crate::init::uninit())?,

`Box::new_uninit()` probably makes more sense here. (What you did is not
wrong, but I think `new_uninit` captures the intent better.)

> +        })
> +    }
> +
> +    /// Allocates and initialises a node that can be inserted into the tree via
> +    /// [`RBTree::insert`].
> +    pub fn try_allocate_node(key: K, value: V) -> Result<RBTreeNode<K, V>> {

Same with this function: I would move it to `RBTreeNode` and call it
`new`.

> +        Ok(Self::try_reserve_node()?.into_node(key, value))
> +    }
> +}
> +
> +impl<K, V> RBTree<K, V>
> +where
> +    K: Ord,

Citing the Rust docs [1] on the requirements that implementations of the
`Ord` trait need to satisfy:

"[...] Violating these requirements is a logic error. The behavior
resulting from a logic error is not specified, but users of the trait
must ensure that such logic errors do not result in undefined behavior.
This means that `unsafe` code **must not** rely on the correctness of
these methods."

I haven't fully checked this yet, since I would have to delve into the
C side, but I wanted to ask whether you have given any thought to this
issue. In particular, this means that you must not rely on `<` (or
`cmp`) being, e.g., transitive in `unsafe` code.
From what I have seen in this patch, I think there are no issues with
the way you use `Ord`.

[1]: https://doc.rust-lang.org/core/cmp/trait.Ord.html
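To make the concern concrete, here is a minimal userspace sketch (not
kernel code). `Broken` and `insert_sorted` are hypothetical names made
up for this illustration. `Broken` deliberately violates the `Ord`
contract, and the insertion helper only uses the comparison result to
pick a bounds-checked index, so the worst outcome is a wrongly ordered
collection rather than undefined behavior:

```rust
use std::cmp::Ordering;

/// Hypothetical key type whose `Ord` impl breaks the trait's contract:
/// every comparison claims `Less`, so the "order" is not antisymmetric.
#[derive(Debug)]
struct Broken(u32);

impl PartialEq for Broken {
    fn eq(&self, _other: &Self) -> bool {
        false
    }
}
impl Eq for Broken {}
impl PartialOrd for Broken {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
impl Ord for Broken {
    fn cmp(&self, _other: &Self) -> Ordering {
        Ordering::Less
    }
}

/// Hypothetical helper: inserts `key` at the position selected by `Ord`.
/// Only safe, bounds-checked operations consume the comparison result,
/// so a bogus `Ord` impl can misplace elements but cannot corrupt memory.
fn insert_sorted<K: Ord>(keys: &mut Vec<K>, key: K) {
    let pos = keys.partition_point(|k| k.cmp(&key) == Ordering::Less);
    keys.insert(pos, key);
}

fn main() {
    // With a well-behaved `Ord`, the result is sorted.
    let mut good = Vec::new();
    for k in [3u32, 1, 2] {
        insert_sorted(&mut good, k);
    }
    assert_eq!(good, vec![1, 2, 3]);

    // With the broken `Ord`, every key lands at the end: the order is
    // wrong (a logic error), but nothing unsound has happened.
    let mut bad = Vec::new();
    for k in [3u32, 1, 2] {
        insert_sorted(&mut bad, Broken(k));
    }
    let order: Vec<u32> = bad.iter().map(|k| k.0).collect();
    assert_eq!(order, vec![3, 1, 2]);
}
```

The thing to look for in the patch would be any `unsafe` block whose
soundness argument depends on the tree actually being ordered.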

> +{

[...]

> +impl<K, V> RBTreeNodeReservation<K, V> {
> +    /// Initialises a node reservation.
> +    ///
> +    /// It then becomes an [`RBTreeNode`] that can be inserted into a tree.
> +    pub fn into_node(mut self, key: K, value: V) -> RBTreeNode<K, V> {
> +        let node_ptr = self.node.as_mut_ptr();
> +        // SAFETY: `node_ptr` is valid, and so are its fields.
> +        unsafe { addr_of_mut!((*node_ptr).links).write(bindings::rb_node::default()) };
> +        // SAFETY: `node_ptr` is valid, and so are its fields.
> +        unsafe { addr_of_mut!((*node_ptr).key).write(key) };
> +        // SAFETY: `node_ptr` is valid, and so are its fields.
> +        unsafe { addr_of_mut!((*node_ptr).value).write(value) };
> +        RBTreeNode {
> +            // SAFETY: The pointer came from a `MaybeUninit<Node>` whose fields have all been
> +            // initialised. Additionally, it has the same layout as `Node`.
> +            node: unsafe { Box::<MaybeUninit<_>>::assume_init(self.node) },
> +        }

I really dislike the verbosity of this function. Also, what ensures
that you really did initialize all fields? I think I have a way to
improve this using a new function on `Box`:

    impl<T> Box<MaybeUninit<T>> {
        fn re_init<E>(self, init: impl Init<T, E>) -> Result<Box<T>, E>;
    }

Then you could do this instead:

    pub fn into_node(mut self, key: K, value: V) -> RBTreeNode<K, V> {
        let node = init!(Node {
            key,
            value,
            links: bindings::rb_node::default(),
        });
        RBTreeNode { node: self.node.re_init(node) }
    }

All the `unsafe` vanishes!

I think this is useful in general, so I am going to send a patch with
the above-mentioned method. In addition, I am going to extend `Box` to
allow converting `Box<T> -> Box<MaybeUninit<T>>` to simplify
`into_reservation` from patch 5.
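
For readers outside the kernel tree, the shape of the idea can be
sketched in plain userspace Rust. This is only an approximation under
stated assumptions: `write_box` is a hypothetical stand-in for the
proposed `re_init`, it takes a finished value instead of an
`impl Init<T, E>` (so it cannot fail and does not initialise in place),
and the `links` field is omitted because the C bindings are unavailable.
What carries over is that the single `unsafe` block lives in one
reusable helper, and the caller uses a struct literal, so the compiler
rejects any forgotten field:

```rust
use std::mem::MaybeUninit;

/// Simplified stand-in for the patch's `Node`; `links: rb_node` is
/// omitted because the kernel bindings do not exist in userspace.
struct Node<K, V> {
    key: K,
    value: V,
}

/// Hypothetical helper in the role of the proposed `re_init`: fills an
/// already allocated slot and returns it as an initialised box, reusing
/// the existing allocation.
fn write_box<T>(slot: Box<MaybeUninit<T>>, value: T) -> Box<T> {
    let raw = Box::into_raw(slot).cast::<T>();
    // SAFETY: `raw` came from a live `Box<MaybeUninit<T>>`, so it is
    // aligned and valid for writing a `T`. After the write the memory
    // holds an initialised `T`, and `MaybeUninit<T>` has the same layout
    // as `T`, so rebuilding the box from the same pointer is sound.
    unsafe {
        raw.write(value);
        Box::from_raw(raw)
    }
}

struct RBTreeNodeReservation<K, V> {
    node: Box<MaybeUninit<Node<K, V>>>,
}

struct RBTreeNode<K, V> {
    node: Box<Node<K, V>>,
}

impl<K, V> RBTreeNodeReservation<K, V> {
    /// No `unsafe` here: the struct literal must name every field of
    /// `Node`, otherwise this does not compile.
    fn into_node(self, key: K, value: V) -> RBTreeNode<K, V> {
        RBTreeNode {
            node: write_box(self.node, Node { key, value }),
        }
    }
}

fn main() {
    // Userspace allocation is infallible here, unlike in the kernel.
    let reservation = RBTreeNodeReservation {
        node: Box::new(MaybeUninit::uninit()),
    };
    let node = reservation.into_node(1u32, "one");
    assert_eq!(node.node.key, 1);
    assert_eq!(node.node.value, "one");
}
```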

--
Cheers,
Benno

> +    }
> +}
> +
> +/// A red-black tree node.
> +///
> +/// The node is fully initialised (with key and value) and can be inserted into a tree without any
> +/// extra allocations or failure paths.
> +pub struct RBTreeNode<K, V> {
> +    node: Box<Node<K, V>>,
> +}
> +
> +// SAFETY: If K and V can be sent across threads, then it's also okay to send [`RBTreeNode`] across
> +// threads.
> +unsafe impl<K: Send, V: Send> Send for RBTreeNode<K, V> {}
> +
> +// SAFETY: If K and V can be accessed without synchronization, then it's also okay to access
> +// [`RBTreeNode`] without synchronization.
> +unsafe impl<K: Sync, V: Sync> Sync for RBTreeNode<K, V> {}
>
> --
> 2.44.0.769.g3c40516874-goog
>