Re: [PATCH net-next v2] net: ctnetlink: support filtering by zone

From: Felix Huettner
Date: Fri Feb 02 2024 - 07:24:24 EST


On Fri, Feb 02, 2024 at 12:12:03PM +0100, Pablo Neira Ayuso wrote:
> On Fri, Feb 02, 2024 at 12:04:35PM +0100, Ilya Maximets wrote:
> > On 12/22/23 13:01, Pablo Neira Ayuso wrote:
> > > On Mon, Nov 27, 2023 at 11:49:16AM +0000, Felix Huettner wrote:
> > >> conntrack zones are heavily used by tools like openvswitch to run
> > >> multiple virtual "routers" on a single machine. In this context each
> > >> conntrack zone matches to a single router, thereby preventing
> > >> overlapping IPs from becoming issues.
> > >> In these systems it is common to operate on all conntrack entries of a
> > >> given zone, e.g. to delete them when a router is deleted. Previously this
> > >> required these tools to dump the full conntrack table and filter out the
> > >> relevant entries in userspace potentially causing performance issues.
> > >>
> > >> To do this we reuse the existing CTA_ZONE attribute. This was previously
> > >> parsed but not used during dump and flush requests. Now if CTA_ZONE is
> > >> set we filter these operations based on the provided zone.
> > >> However this means that users that previously passed CTA_ZONE will
> > >> experience a difference in functionality.
> > >>
> > >> Alternatively CTA_FILTER could have been used for the same
> > >> functionality. However it is not yet supported during flush requests and
> > >> is only available when using AF_INET or AF_INET6.
> > >
> > > For the record, this is applied to nf-next.
> >
> > Hi, Felix and Pablo.
> >
> > I was looking through the code and the following part is bothering me:
> >
> > diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
> > index fb0ae15e96df..4e9133f61251 100644
> > --- a/net/netfilter/nf_conntrack_netlink.c
> > +++ b/net/netfilter/nf_conntrack_netlink.c
> > @@ -1148,6 +1149,10 @@ static int ctnetlink_filter_match(struct nf_conn *ct, void *data)
> >  	if (filter->family && nf_ct_l3num(ct) != filter->family)
> >  		goto ignore_entry;
> >
> > +	if (filter->zone.id != NF_CT_DEFAULT_ZONE_ID &&
> > +	    !nf_ct_zone_equal_any(ct, &filter->zone))
> > +		goto ignore_entry;
> > +
> >  	if (filter->orig_flags) {
> >  		tuple = nf_ct_tuple(ct, IP_CT_DIR_ORIGINAL);
> >  		if (!ctnetlink_filter_match_tuple(&filter->orig, tuple,
> >
> > If I'm reading that right, the default zone is always flushed, even if the
> > user requested to flush a different zone. I.e. the entry is never ignored
> > for a default zone. Is that correct or am I reading that wrong?
> >
> > If my observation is correct, then I don't think this functionality can
> > actually be used by applications as it does something unexpected.
>
> This needs a fix, the NF_CT_DEFAULT_ZONE_ID is used as a marker to
> indicate if the filtering by zone needs to happen or not.
>
> I'd suggest to add a boolean flag that specifies that zone filtering
> is set on.

Hi Pablo and Ilya,

Thanks for finding that.
I will build a fix for it.
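
To make the difference concrete, here is a minimal user-space sketch of the
two checks. It is only a model, not the kernel patch: struct model_filter,
match_by_sentinel() and match_by_flag() are hypothetical names, and the zone
comparison is reduced to a plain id compare in place of
nf_ct_zone_equal_any(). The zone_filter flag stands for the boolean Pablo
suggests, assumed to be set whenever the request carried CTA_ZONE.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for NF_CT_DEFAULT_ZONE_ID (assumed to be 0 in this model). */
#define MODEL_DEFAULT_ZONE_ID 0

/* Hypothetical, simplified stand-in for the ctnetlink filter state. */
struct model_filter {
	uint16_t zone_id;	/* zone requested by the user */
	bool zone_filter;	/* assumed: true only if CTA_ZONE was passed */
};

/*
 * Sentinel-based check, mirroring the quoted hunk: the default zone id
 * doubles as "no zone filter requested", so a request for zone 0 cannot
 * be told apart from a request without any zone.
 */
static bool match_by_sentinel(uint16_t ct_zone, const struct model_filter *f)
{
	if (f->zone_id != MODEL_DEFAULT_ZONE_ID && ct_zone != f->zone_id)
		return false;	/* entry is ignored */
	return true;		/* entry is dumped or flushed */
}

/*
 * Flag-based check: filtering is driven by an explicit boolean, so
 * zone 0 can be selected like any other zone.
 */
static bool match_by_flag(uint16_t ct_zone, const struct model_filter *f)
{
	if (f->zone_filter && ct_zone != f->zone_id)
		return false;	/* entry is ignored */
	return true;		/* entry is dumped or flushed */
}
```

With a filter of { .zone_id = 0, .zone_filter = true }, match_by_sentinel()
still accepts an entry from zone 5, while match_by_flag() rejects it and
accepts only entries from zone 0. For a non-default zone both variants behave
the same.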