Re: [PATCH] quota: fix race condition between dqput() and dquot_mark_dquot_dirty()

From: Baokun Li
Date: Tue Jun 27 2023 - 10:06:58 EST


On 2023/6/27 17:28, Jan Kara wrote:
On Tue 27-06-23 17:08:27, Baokun Li wrote:
Hello!

On 2023/6/27 16:34, Jan Kara wrote:
Hello!

On Mon 26-06-23 21:55:49, Baokun Li wrote:
On 2023/6/26 21:09, Jan Kara wrote:
On Sun 25-06-23 15:56:10, Baokun Li wrote:
I think we can simply focus on the race between the DQ_ACTIVE_B flag and
the DQ_MOD_B flag, which is the core problem, because the same quota
should not have both flags. These two flags are protected by dq_list_lock
and dquot->dq_lock respectively, so it makes sense to add a
wait_on_dquot() to ensure the accuracy of DQ_ACTIVE_B.
But the fundamental problem is not only the race with DQ_MOD_B setting. The
dquot structure can be completely freed by the time
dquot_claim_space_nodirty() calls dquot_mark_dquot_dirty() on it. That's
why I think making __dquot_transfer() obey dquot_srcu rules is the right
solution.
Yes, now I also think that making __dquot_transfer() obey dquot_srcu
rules is a better solution. But with inode->i_lock protection, why would
the dquot structure be completely freed?
Well, when dquot_claim_space_nodirty() calls mark_all_dquot_dirty() it does
not hold any locks (only dquot_srcu). So nothing prevents dquot_transfer()
to go, swap dquot structure pointers and drop dquot references and after
that mark_all_dquot_dirty() can use a stale pointer to call
mark_dquot_dirty() on already freed memory.

No, this doesn't look like it's going to happen. The
mark_all_dquot_dirty() uses a pointer array pointer, the dquot in the
array is dynamically changing, so after swap dquot structure pointers,
mark_all_dquot_dirty() uses the new pointer, and the stale pointer is
always destroyed after swap, so there is no case of using the stale
pointer here.
There is a case - CPU0 can prefetch the values from dquots[] array into its
local cache, then CPU1 can update the dquots[] array (these writes can
happily stay in CPU1 store cache invisible to other CPUs) and free the
dquots via dqput(). Then CPU0 can pass the prefetched dquot pointers to
mark_dquot_dirty(). There are no locks or memory barries preventing CPUs
from ordering instructions and memory operations like this in the code...
You can read Documentation/memory-barriers.txt about all the perils current
CPU architecture brings wrt coordination of memory accesses among CPUs ;)

Honza
Got it!

Sorry for misunderstanding you (I thought "completely freed" meant
dquot_destroy(), but you should have meant dquot_release()).
Well, the dquot can even get to dquot_destroy(). There's nothing really
preventing CPU2 going into memory reclaim and free the dquot in
dqcache_shrink_scan() still before CPU0 even calls mark_dquot_dirty() on
it. Sure such timing on real hardware is very unlikely but in a VM where a
virtual CPU can get starved for a significant amount of time this could
happen.

Honza
Yes, invalidate_dquots() calling do_destroy_dquot() does not have this problem
because it calls synchronize_srcu(&dquot_srcu) in drop_dquot_ref() before.

However, calling do_destroy_dquot() from dqcache_shrink_scan() is not
protected, and calling dqcache_shrink_scan() after P3 execution will trigger
the UAF by calling do_destroy_dquot() twice, as shown in function graph 1
in the patch description; If dqcache_shrink_scan() is called after dquot is
added to free_dquots and before P3 is executed, the UAF may be
triggered in dquot_mark_dquot_dirty().

Thank you for your patient explanation!
The new version of the solution is almost complete, and is doing some stress
testing, which I will send out once it passes.
--
With Best Regards,
Baokun Li
.