Re: [PATCH net-next v2] net: ctnetlink: support filtering by zone

From: Pablo Neira Ayuso
Date: Fri Feb 02 2024 - 06:21:37 EST


On Fri, Feb 02, 2024 at 12:04:35PM +0100, Ilya Maximets wrote:
> On 12/22/23 13:01, Pablo Neira Ayuso wrote:
> > On Mon, Nov 27, 2023 at 11:49:16AM +0000, Felix Huettner wrote:
> >> conntrack zones are heavily used by tools like openvswitch to run
> >> multiple virtual "routers" on a single machine. In this context each
> >> conntrack zone matches to a single router, thereby preventing
> >> overlapping IPs from becoming issues.
> >> In these systems it is common to operate on all conntrack entries of a
> >> given zone, e.g. to delete them when a router is deleted. Previously this
> >> required these tools to dump the full conntrack table and filter out the
> >> relevant entries in userspace potentially causing performance issues.
> >>
> >> To do this we reuse the existing CTA_ZONE attribute. This was previously
> >> parsed but not used during dump and flush requests. Now if CTA_ZONE is
> >> set we filter these operations based on the provided zone.
> >> However this means that users that previously passed CTA_ZONE will
> >> experience a difference in functionality.
> >>
> >> Alternatively CTA_FILTER could have been used for the same
> >> functionality. However it is not yet supported during flush requests and
> >> is only available when using AF_INET or AF_INET6.
> >
> > For the record, this is applied to nf-next.
>
> Hi, Felix and Pablo.
>
> I was looking through the code and the following part is bothering me:
>
> diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
> index fb0ae15e96df..4e9133f61251 100644
> --- a/net/netfilter/nf_conntrack_netlink.c
> +++ b/net/netfilter/nf_conntrack_netlink.c
> @@ -1148,6 +1149,10 @@ static int ctnetlink_filter_match(struct nf_conn *ct, void *data)
>  	if (filter->family && nf_ct_l3num(ct) != filter->family)
>  		goto ignore_entry;
>
> +	if (filter->zone.id != NF_CT_DEFAULT_ZONE_ID &&
> +	    !nf_ct_zone_equal_any(ct, &filter->zone))
> +		goto ignore_entry;
> +
>  	if (filter->orig_flags) {
>  		tuple = nf_ct_tuple(ct, IP_CT_DIR_ORIGINAL);
>  		if (!ctnetlink_filter_match_tuple(&filter->orig, tuple,
>
> If I'm reading that right, the default zone is always flushed, even if the
> user requested to flush a different zone. I.e. the entry is never ignored
> for a default zone. Is that correct or am I reading that wrong?
>
> If my observation is correct, then I don't think this functionality can
> actually be used by applications as it does something unexpected.

This needs a fix: NF_CT_DEFAULT_ZONE_ID is used as a marker to
indicate whether filtering by zone needs to happen or not.

I'd suggest adding a boolean flag that specifies whether zone
filtering is enabled.
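
A rough userspace sketch of the flag idea, not the actual kernel change.
All names here (mock_ct, mock_filter, zone_set, match_marker, match_flag)
are hypothetical stand-ins, and DEFAULT_ZONE_ID is assumed to be 0 in
place of NF_CT_DEFAULT_ZONE_ID. It only illustrates why a marker value
cannot express "filter by the default zone", while an explicit flag can:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: stands in for NF_CT_DEFAULT_ZONE_ID. */
#define DEFAULT_ZONE_ID 0

/* Hypothetical, heavily simplified stand-in for a conntrack entry. */
struct mock_ct {
	uint16_t zone_id;
};

/* Hypothetical stand-in for the dump/flush filter. */
struct mock_filter {
	uint16_t zone_id;
	bool zone_set;	/* true only if the request carried a zone */
};

/* Marker approach: the zone id doubles as "zone filtering enabled". */
static bool match_marker(const struct mock_ct *ct,
			 const struct mock_filter *f)
{
	if (f->zone_id != DEFAULT_ZONE_ID && ct->zone_id != f->zone_id)
		return false;	/* ignore entry */
	return true;
}

/* Flag approach: an explicit boolean decides whether to filter. */
static bool match_flag(const struct mock_ct *ct,
		       const struct mock_filter *f)
{
	if (f->zone_set && ct->zone_id != f->zone_id)
		return false;	/* ignore entry */
	return true;
}

/* Small self-check: a request for the default zone vs. a zone-5 entry. */
void zone_filter_selfcheck(void)
{
	struct mock_ct ct5 = { .zone_id = 5 };
	struct mock_filter want_default = {
		.zone_id = DEFAULT_ZONE_ID,
		.zone_set = true,
	};

	/* Marker: the zone-5 entry still matches, so nothing is filtered. */
	assert(match_marker(&ct5, &want_default));
	/* Flag: the zone-5 entry is correctly ignored. */
	assert(!match_flag(&ct5, &want_default));
}
```

With the flag, a request that carries no zone at all leaves zone_set
false and keeps the old "match everything" behaviour.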