Re: [PATCH nf] netfilter: nf_tables: use RCU-safe list primitives for basechain hook list

From: Florian Westphal

Date: Fri Apr 10 2026 - 06:34:52 EST


Weiming Shi <bestswngs@xxxxxxxxx> wrote:
> NFT_MSG_GETCHAIN runs as an NFNL_CB_RCU callback, so chain dumps
> traverse basechain->hook_list under rcu_read_lock() without holding
> commit_mutex. Meanwhile, nft_delchain_hook() mutates that same live
> hook_list with plain list_move() and list_splice(), and the commit/abort
> paths splice hooks back with plain list_splice(). None of these are
> RCU-safe list operations.
>
> A concurrent GETCHAIN dump can observe partially updated list pointers,
> follow them into stack-local or transaction-private list heads, and
> crash when container_of() produces a bogus struct nft_hook pointer.
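
To illustrate the failure mode described in the quoted text, here is a minimal single-threaded userspace sketch. It is not the kernel code: struct list_head is re-implemented locally, struct hook is a hypothetical stand-in for struct nft_hook, and list_move_plain() is a simplified model of a plain list_move(). The "reader" is simulated by saving its cursor before the writer runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head (assumption). */
struct list_head { struct list_head *next, *prev; };

/* Hypothetical stand-in for struct nft_hook. */
struct hook { int id; struct list_head list; };

static void list_init(struct list_head *h) { h->next = h; h->prev = h; }

static void list_add_tail_plain(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/* Plain move: unlink from the current list, relink onto another one. */
static void list_move_plain(struct list_head *n, struct list_head *to)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_add_tail_plain(n, to);	/* overwrites n->next and n->prev */
}

/*
 * Returns true if a reader that was paused on a hook ends up on the
 * private list head after the writer moved that hook away.
 */
bool reader_escapes_live_list(void)
{
	struct list_head live, private_list;
	struct hook a = { .id = 1 }, b = { .id = 2 };

	list_init(&live);
	list_init(&private_list);
	list_add_tail_plain(&a.list, &live);
	list_add_tail_plain(&b.list, &live);

	struct list_head *pos = live.next;	/* reader is visiting hook a */
	assert(pos == &a.list);

	list_move_plain(&a.list, &private_list);	/* writer runs */

	pos = pos->next;			/* reader resumes its walk */

	/*
	 * The reader's loop only stops at &live. Here pos is a bare list
	 * head that is not embedded in any struct hook, so a container_of()
	 * on it would yield a bogus pointer.
	 */
	return pos != &live && pos == &private_list;
}
```

In this model reader_escapes_live_list() returns true: the reader never sees &live again through this path and would treat the private list head as an entry.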

Right, but this is broken by design.

> Replace list_move() in nft_delchain_hook() with list_del_rcu() plus an
> intermediate pointer array, followed by synchronize_rcu() before the
> deleted hooks' list pointers are reused to link them into the
> transaction's private list. In the error paths, put hooks back with
> list_add_tail_rcu() which is safe for concurrent RCU readers (they
> either continue to the original successor or see the list head and
> terminate the walk).
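
The property the quoted paragraph relies on can be sketched the same way. This is again a userspace model under stated assumptions, not the kernel implementation: struct dlist, del_rcu_style() and add_tail_rcu_style() are hypothetical names that mimic only the pointer updates of list_del_rcu() and list_add_tail_rcu(). The real primitives additionally use rcu_assign_pointer()/WRITE_ONCE() and poison the prev pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for a kernel list head (assumption). */
struct dlist { struct dlist *next, *prev; };

/* Hypothetical stand-in for struct nft_hook. */
struct hook_entry { int id; struct dlist list; };

static void dlist_init(struct dlist *h) { h->next = h; h->prev = h; }

/* Models list_add_tail_rcu(): new node is fully set up, then published. */
static void add_tail_rcu_style(struct dlist *n, struct dlist *h)
{
	n->next = h;
	n->prev = h->prev;
	h->prev->next = n;	/* kernel: rcu_assign_pointer() */
	h->prev = n;
}

/* Models list_del_rcu(): unlink, but leave n->next intact for readers. */
static void del_rcu_style(struct dlist *n)
{
	n->next->prev = n->prev;
	n->prev->next = n->next;
	n->prev = NULL;		/* kernel: LIST_POISON2 */
}

/*
 * Returns true if a reader paused on a removed hook always has a forward
 * pointer that stays on the live list: first the original successor,
 * then, after the hook is put back at the tail, the list head.
 */
bool reader_stays_on_live_list(void)
{
	struct dlist live;
	struct hook_entry a = { .id = 1 }, b = { .id = 2 }, c = { .id = 3 };

	dlist_init(&live);
	add_tail_rcu_style(&a.list, &live);
	add_tail_rcu_style(&b.list, &live);
	add_tail_rcu_style(&c.list, &live);

	struct dlist *pos = &b.list;		/* reader is visiting hook b */

	del_rcu_style(&b.list);
	bool sees_successor = (pos->next == &c.list);

	add_tail_rcu_style(&b.list, &live);	/* error path: put it back */
	bool sees_head = (pos->next == &live);

	return sees_successor && sees_head;
}
```

In this model the reader's next step lands either on hook c or on &live, which matches the two outcomes named in the quoted text. The sketch says nothing about whether unlinking and relinking is the right design in the first place.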

I don't understand the existing code.

I don't even understand why we have a difference between the
'update delete' and chain delete cases.

I think it's wrong to unlink and then relink on abort.
What prevents nft_delchain_hook() from using the normal approach done
by nft_delchain()...?

This existing code appears to be way too complex.