Re: [PATCH] gfs2: Fix uaf for qda in gfs2_quota_sync

From: Bob Peterson
Date: Tue Aug 22 2023 - 15:33:23 EST


On 1/26/23 11:10 PM, eadavis@xxxxxxxx wrote:
From: Edward Adam Davis <eadavis@xxxxxxxx>

[ 81.372851][ T5532] CPU: 1 PID: 5532 Comm: syz-executor.0 Not tainted 6.2.0-rc1-syzkaller-dirty #0
[ 81.382080][ T5532] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/12/2023
[ 81.392343][ T5532] Call Trace:
[ 81.395654][ T5532] <TASK>
[ 81.398603][ T5532] dump_stack_lvl+0x1b1/0x290
[ 81.418421][ T5532] gfs2_assert_warn_i+0x19a/0x2e0
[ 81.423480][ T5532] gfs2_quota_cleanup+0x4c6/0x6b0
[ 81.428611][ T5532] gfs2_make_fs_ro+0x517/0x610
[ 81.457802][ T5532] gfs2_withdraw+0x609/0x1540
[ 81.481452][ T5532] gfs2_inode_refresh+0xb2d/0xf60
[ 81.506658][ T5532] gfs2_instantiate+0x15e/0x220
[ 81.511504][ T5532] gfs2_glock_wait+0x1d9/0x2a0
[ 81.516352][ T5532] do_sync+0x485/0xc80
[ 81.554943][ T5532] gfs2_quota_sync+0x3da/0x8b0
[ 81.559738][ T5532] gfs2_sync_fs+0x49/0xb0
[ 81.564063][ T5532] sync_filesystem+0xe8/0x220
[ 81.568740][ T5532] generic_shutdown_super+0x6b/0x310
[ 81.574112][ T5532] kill_block_super+0x79/0xd0
[ 81.578779][ T5532] deactivate_locked_super+0xa7/0xf0
[ 81.584064][ T5532] cleanup_mnt+0x494/0x520
[ 81.593753][ T5532] task_work_run+0x243/0x300
[ 81.608837][ T5532] exit_to_user_mode_loop+0x124/0x150
[ 81.614232][ T5532] exit_to_user_mode_prepare+0xb2/0x140
[ 81.619820][ T5532] syscall_exit_to_user_mode+0x26/0x60
[ 81.625287][ T5532] do_syscall_64+0x49/0xb0
[ 81.629710][ T5532] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 81.636292][ T5532] RIP: 0033:0x7efdd688d517
[ 81.640728][ T5532] Code: ff ff ff f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 31 f6 e9 09 00 00 00 66 0f 1f 84 00 00 00 00 00 b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
[ 81.660550][ T5532] RSP: 002b:00007fff34520ce8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
[ 81.669413][ T5532] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007efdd688d517
[ 81.677403][ T5532] RDX: 00007fff34520db9 RSI: 000000000000000a RDI: 00007fff34520db0
[ 81.685388][ T5532] RBP: 00007fff34520db0 R08: 00000000ffffffff R09: 00007fff34520b80
[ 81.695973][ T5532] R10: 0000555555ca38b3 R11: 0000000000000246 R12: 00007efdd68e6b24
[ 81.704152][ T5532] R13: 00007fff34521e70 R14: 0000555555ca3810 R15: 00007fff34521eb0
[ 81.712868][ T5532] </TASK>

gfs2_quota_cleanup() may be called from within do_sync(), as the call trace above shows.
This releases the qda entries obtained by qd_check_sync(), resulting in a use-after-free (UAF).
To avoid this UAF, check whether "sdp->sd_quota_bitmap" has already been freed by
gfs2_quota_cleanup(), which confirms that "sdp->sd_quota_list" has been released.

Link: https://lore.kernel.org/all/0000000000002b5e2405f14e860f@xxxxxxxxxx
Reported-and-tested-by: syzbot+3f6a670108ce43356017@xxxxxxxxxxxxxxxxxxxxxxxxx
Signed-off-by: Edward Adam Davis <eadavis@xxxxxxxx>
---
fs/gfs2/quota.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/fs/gfs2/quota.c b/fs/gfs2/quota.c
index 1ed1722..4cf66bd 100644
--- a/fs/gfs2/quota.c
+++ b/fs/gfs2/quota.c
@@ -1321,6 +1321,9 @@ int gfs2_quota_sync(struct super_block *sb, int type)
 					qda[x]->qd_sync_gen =
 						sdp->sd_quota_sync_gen;
 
+			if (!sdp->sd_quota_bitmap)
+				break;
+
 			for (x = 0; x < num_qd; x++)
 				qd_unlock(qda[x]);
 		}

Hi Edward,

Can you try to recreate this problem on a newer version of the gfs2 code?

In the linux-gfs2 repository I've got a branch called "bobquota" that has a bunch of patches related to quota syncing. I don't know if these will fix your problem, but it's worth trying.

The thing is, the qda array should have been populated by previous calls, like qd_fish and such, and it should be okay for quota_cleanup to release those entries.

I can tell you this:

In the call trace above, do_sync tried to lock an inode glock, which in turn tried to instantiate the inode, and that caused a withdraw.
The only inode glock used by do_sync is the one for the system quota inode. If instantiating it failed, your file system is corrupt and needs to be run through fsck.gfs2. It could also indicate a hardware problem reading the system quota dinode from the storage media.
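
To make that sequence concrete, here is a minimal user-space sketch of the nesting seen in the call trace. It is only a model, not gfs2 code: every model_* name, the dinode_ok flag, and the -EIO return value are assumptions made up for illustration. All it does is record the order of calls, so you can see that the cleanup step runs before do_sync has returned to its caller.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_MAX_EVENTS 16

/* Call log so the nesting order can be inspected afterwards. */
static const char *model_events[MODEL_MAX_EVENTS];
static size_t model_num_events;

static void model_log(const char *name)
{
	assert(model_num_events < MODEL_MAX_EVENTS);
	model_events[model_num_events++] = name;
}

/* Models the gfs2_withdraw -> gfs2_make_fs_ro -> gfs2_quota_cleanup chain. */
static void model_withdraw(void)
{
	model_log("withdraw");
	model_log("make_fs_ro");
	model_log("quota_cleanup");
}

/*
 * Models gfs2_glock_wait -> gfs2_instantiate -> gfs2_inode_refresh for the
 * system quota inode. "dinode_ok" is a hypothetical flag standing in for
 * whether the on-disk quota dinode could be read and validated.
 */
static int model_lock_quota_inode(bool dinode_ok)
{
	model_log("glock_wait");
	model_log("instantiate");
	model_log("inode_refresh");
	if (!dinode_ok) {
		model_withdraw();
		return -EIO; /* assumed error code, for the model only */
	}
	return 0;
}

static int model_do_sync(bool dinode_ok)
{
	int error;

	model_log("do_sync:enter");
	error = model_lock_quota_inode(dinode_ok);
	model_log("do_sync:exit");
	return error;
}

/* Returns true if quota_cleanup was logged before do_sync returned. */
static bool model_cleanup_nested_in_sync(bool dinode_ok)
{
	bool seen_cleanup = false;
	size_t i;

	model_num_events = 0;
	model_do_sync(dinode_ok);
	for (i = 0; i < model_num_events; i++) {
		if (strcmp(model_events[i], "quota_cleanup") == 0)
			seen_cleanup = true;
		if (strcmp(model_events[i], "do_sync:exit") == 0)
			return seen_cleanup;
	}
	return false;
}
```

In this model, model_cleanup_nested_in_sync(false) reports that the cleanup ran inside do_sync, while model_cleanup_nested_in_sync(true) reports that it did not. Whether that nesting is actually unsafe in gfs2 depends on locking and reference counting that the sketch does not attempt to model.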

If possible, I'd like to know how you caused this problem to occur. What were you doing when it happened, and how can I recreate it?

GFS2 might have a problem with withdrawing during this sequence, but I don't think it has much to do with the sd_quota_bitmap.

Regards,

Bob Peterson
GFS2 File System