Re:Re: [PATCH-cpuset v10 1/2] Union-Find: add a new module in kernel library

From: Xavier
Date: Wed Jul 03 2024 - 07:31:37 EST


Hi Michal,

At 2024-07-03 17:40:25, "Michal Koutný" <mkoutny@xxxxxxxx> wrote:
>On Wed, Jul 03, 2024 at 02:37:26PM GMT, Xavier <xavier_qy@xxxxxxx> wrote:
>> This patch implements a union-find data structure in the kernel library,
>> which includes operations for allocating nodes, freeing nodes,
>> finding the root of a node, and merging two nodes.
>>
>> Signed-off-by: Xavier <xavier_qy@xxxxxxx>
>> ---
>> Documentation/core-api/union_find.rst | 102 ++++++++++++++++++
>> .../zh_CN/core-api/union_find.rst | 87 +++++++++++++++
>> MAINTAINERS | 9 ++
>> include/linux/union_find.h | 41 +++++++
>> lib/Makefile | 2 +-
>> lib/union_find.c | 48 +++++++++
>> 6 files changed, 288 insertions(+), 1 deletion(-)
>> create mode 100644 Documentation/core-api/union_find.rst
>> create mode 100644 Documentation/translations/zh_CN/core-api/union_find.rst
>> create mode 100644 include/linux/union_find.h
>> create mode 100644 lib/union_find.c
>>
>
>Nice.
>I'd so s/Union-Find/union-find/ both in the docs and the code
>(comments), I didn't find any rule why two capitalizations are used.

Union-Find only appears in the patch description or title; in the main text, we consistently
use union-find. This will be corrected in the next version.

>> +struct uf_node {
>> + struct uf_node *parent;
>> + unsigned int rank;
>> +};
>> +
>> +/* This macro is used for static initialization of a union-find node. */
>> +#define UF_INIT_NODE(node) {.parent = &node, .rank = 0}
>> +
>> +/**
>> + * uf_node_init - Initialize a union-find node
>> + * @node: pointer to the union-find node to be initialized
>> + *
>> + * This function sets the parent of the node to itself and
>> + * initializes its rank to 0.
>> + */
>> +static inline void uf_node_init(struct uf_node *node)
>> +{
>> + node->parent = node;
>> + node->rank = 0;
>> +}
>
>Xavier, not sure if you responded to my suggestion of considered zeroed
>object a valid initialized one. That could save some init work (and
>move it to alternative uf_find, see below).
>
>With uf_find body checking for NULL:
>
> while (node->parent != node) {
> parent = node->parent;
> node->parent = parent ? parent->parent : node;
> node = node->parent;
> }

Yes, I noticed your suggestion. In patch v4, I implemented it by initializing to 0
and adding a check for whether the parent is 0 in uf_find. However, later,
when I was reviewing the algorithm's documentation, I noticed it requires
initialization to itself. Moreover, uf_find is a high-frequency operation, if we
add a parent check within it, the efficiency impact each time would be more
significant than initializing once. Therefore, I adhered to the initialization
to itself approach.

>> +/**
>> + * uf_union - Merge two sets, using union by rank
>> + * @node1: the first node
>> + * @node2: the second node
>> + *
>> + * This function merges the sets containing node1 and node2, by comparing
>> + * the ranks to keep the tree balanced.
>> + */
>> +void uf_union(struct uf_node *node1, struct uf_node *node2)
>> +{
>> + struct uf_node *root1 = uf_find(node1);
>> + struct uf_node *root2 = uf_find(node2);
>> +
>> + if (root1 != root2) {
>
>if (root1 == root2)
> return;
>then the rest can be one level less nested ;-)
>
Of course, this change makes it look clearer.

--
Best Regards,
Xavier