Re: [PATCH nf] netfilter: nf_tables: unconditionally flush pending work before notifier

From: Hillf Danton
Date: Thu Jul 04 2024 - 06:35:50 EST


On Wed, 3 Jul 2024 15:01:07 +0200 Florian Westphal <fw@xxxxxxxxx>
> Hillf Danton <hdanton@xxxxxxxx> wrote:
> > On Wed, 3 Jul 2024 12:52:15 +0200 Florian Westphal <fw@xxxxxxxxx>
> > > Hillf Danton <hdanton@xxxxxxxx> wrote:
> > > > Given trans->table goes thru the lifespan of trans, your proposal is a bandaid
> > > > if trans outlives table.
> > >
> > > trans must never outlive table.
> > >
> > What is preventing trans from being freed after closing sock, given
> > trans is freed in workqueue?
> >
> > close sock
> > queue work
>
> The notifier acquires the transaction mutex, locking out all other
> transactions, so no further transaction requests referencing
> the table can be queued.
>
As per the syzbot report, trans->table could be instantiated before
the notifier acquires the transaction mutex. And in fact the lock helps
trans outlive table even with your patch.

	cpu1				cpu2
	---				---
	transB->table = A
					lock trans mutex
					flush work
					free A
					unlock trans mutex

	queue work to free transB
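
To make the ordering above concrete, here is a minimal user-space sketch,
not kernel code. Everything in it is a hypothetical stand-in: struct table,
struct trans, queue_trans(), flush_work_sim() and simulate_teardown() are
invented for illustration and do not mirror the real nf_tables structures
or the workqueue API. The table is only marked as freed instead of being
released, so the stale access can be counted safely.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only. */
struct table {
	bool freed;		/* set instead of actually freeing memory */
};

struct trans {
	struct table *table;	/* lives for the whole lifespan of trans */
	struct trans *next;
};

struct sim {
	struct trans *pending;	/* transactions queued for destruction */
	int stale_derefs;	/* times the worker saw a freed table */
};

/* Model of "queue work to free trans". */
static void queue_trans(struct sim *s, struct trans *t)
{
	t->next = s->pending;
	s->pending = t;
}

/* Model of the worker: drain the queue, touching trans->table each time. */
static void flush_work_sim(struct sim *s)
{
	while (s->pending) {
		struct trans *t = s->pending;

		s->pending = t->next;
		if (t->table->freed)
			s->stale_derefs++;
	}
}

/*
 * Replay the timeline. If queued_before_notifier is false, transB is
 * queued only after the notifier has flushed and freed the table, which
 * is the ordering shown in the diagram.
 * Returns the number of stale table accesses seen by the worker.
 */
int simulate_teardown(bool queued_before_notifier)
{
	struct table A = { .freed = false };
	struct trans transB = { .table = &A, .next = NULL };
	struct sim s = { .pending = NULL, .stale_derefs = 0 };

	/* cpu1: transB->table = A happened above, before the notifier */
	if (queued_before_notifier)
		queue_trans(&s, &transB);

	/* cpu2: lock trans mutex, flush work, free A, unlock trans mutex */
	flush_work_sim(&s);
	A.freed = true;

	/* cpu1: queue work to free transB, then the worker runs */
	if (!queued_before_notifier)
		queue_trans(&s, &transB);
	flush_work_sim(&s);

	return s.stale_derefs;
}
```

Calling simulate_teardown(false) replays the diagram and reports one stale
access; simulate_teardown(true) reports none, because the flush in the
notifier already drained transB. The model is single-threaded on purpose:
it only shows that a flush cannot cover work that is queued after it.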

> The work queue is flushed before potentially ripping the table
> out. After this, no transactions referencing the table can exist
> anymore; the only transactions that can still be queued are those
> coming from a different netns, and tables are scoped per netns.
>
> Table is torn down. Transaction mutex is released.
>
> Next transaction from userspace can't find the table anymore (it's gone),
> so no more transactions can be queued for this table.
>
> As I wrote in the commit message, the flush is dumb, this should first
> walk to see if there is a matching table to be torn down, and then flush
> work queue once before tearing the table down.
>
> But it's better to clearly split bug fix and such a change.