Re: [RESEND] [PATCH bpf-next 2/3] bpf: Overwrite the element in hash map atomically
From: Hou Tao
Date: Tue Feb 04 2025 - 20:39:08 EST
+cc Cody Haas
Sorry for the resend. I sent the reply in the HTML format.
On 2/4/2025 4:28 PM, Hou Tao wrote:
> Currently, the update of existing element in hash map involves two
> steps:
> 1) insert the new element at the head of the hash list
> 2) remove the old element
>
> It is possible that the concurrent lookup operation may fail to find
> either the old element or the new element if the lookup operation starts
> before the addition and continues after the removal.
>
> Therefore, replacing the two-step update with an atomic update. After
> the change, the update will be atomic in the perspective of the lookup
> operation: it will either find the old element or the new element.
>
> Signed-off-by: Hou Tao <hotforest@xxxxxxxxx>
> ---
> kernel/bpf/hashtab.c | 14 ++++++++------
> 1 file changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> index 4a9eeb7aef85..a28b11ce74c6 100644
> --- a/kernel/bpf/hashtab.c
> +++ b/kernel/bpf/hashtab.c
> @@ -1179,12 +1179,14 @@ static long htab_map_update_elem(struct bpf_map *map, void *key, void *value,
> goto err;
> }
>
> - /* add new element to the head of the list, so that
> - * concurrent search will find it before old elem
> - */
> - hlist_nulls_add_head_rcu(&l_new->hash_node, head);
> - if (l_old) {
> - hlist_nulls_del_rcu(&l_old->hash_node);
> + if (!l_old) {
> + hlist_nulls_add_head_rcu(&l_new->hash_node, head);
> + } else {
> + /* Replace the old element atomically, so that
> + * concurrent search will find either the new element or
> + * the old element.
> + */
> + hlist_nulls_replace_rcu(&l_new->hash_node, &l_old->hash_node);
>
> /* l_old has already been stashed in htab->extra_elems, free
> * its special fields before it is available for reuse. Also
>
After thinking about it the second time, the atomic list replacement on
the update side is enough to make lookup operation always find the
existing element. However, due to the immediate reuse, the lookup may
find an unexpected value. Maybe we should disable the immediate reuse
for specific map (e.g., htab of maps).