Re: [PATCH v5 06/10] rbtree: Implement generic latch_tree

From: Ingo Molnar
Date: Mon Apr 13 2015 - 12:43:24 EST



* Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

> Implement a latched RB-tree in order to get unconditional RCU/lockless
> lookups.
>
> Cc: Oleg Nesterov <oleg@xxxxxxxxxx>
> Cc: Michel Lespinasse <walken@xxxxxxxxxx>
> Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx>
> Cc: David Woodhouse <David.Woodhouse@xxxxxxxxx>
> Cc: Rik van Riel <riel@xxxxxxxxxx>
> Cc: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>
> Cc: "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> ---
> include/linux/rbtree_latch.h | 212 +++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 212 insertions(+)
>
> --- /dev/null
> +++ b/include/linux/rbtree_latch.h
> @@ -0,0 +1,212 @@
> +/*
> + * Latched RB-trees
> + *
> + * Copyright (C) 2015 Intel Corp., Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> + *
> + * Since RB-trees have non atomic modifications they're not immediately suited
> + * for RCU/lockless queries. Even though we made RB tree lookups non-fatal for
> + * lockless lookups; we cannot guarantee they return a correct result.
> + *
> + * The simplest solution is a seqlock + rb-tree, this will allow lockless
> + * lookups; but has the constraint (inherent to the seqlock) that read sides
> + * cannot nest in write sides.
> + *
> + * If we need to allow unconditional lookups (say as required for NMI context
> + * usage) we need a more complex setup; this data structure provides this by
> + * employing the latch technique -- see @raw_write_seqcount_latch -- to
> + * implement a latched RB-tree which does allow for unconditional lookups by
> + * virtue of always having (at least) one stable copy of the tree.
> + *
> + * However, while we have the guarantee that there is at all times one stable
> + * copy, this does not guarantee an iteration will not observe modifications.
> + * What might have been a stable copy at the start of the iteration, need not
> + * remain so for the duration of the iteration.
> + *
> + * Therefore, this does require a lockless RB-tree iteration to be non-fatal;
> + * see the comment in lib/rbtree.c. Note however that we only require the first
> + * condition -- not seeing partial stores -- because the latch thing isolates
> + * us from loops. If we were to interrupt a modification the lookup would be
> + * pointed at the stable tree and complete while the modification was halted.

Minor nit: so this text has 3 variants to spell RB-trees:

RB-tree
RB tree
rb-tree

I suggest we pick one! :-)

Thanks,

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/