Re: [PATCH 1/3] ima: Remove inode lock

From: Roberto Sassu
Date: Fri Oct 11 2024 - 09:31:07 EST


On Wed, 2024-10-09 at 18:43 +0200, Roberto Sassu wrote:
> On Wed, 2024-10-09 at 18:25 +0200, Roberto Sassu wrote:
> > On Wed, 2024-10-09 at 11:37 -0400, Paul Moore wrote:
> > > On Wed, Oct 9, 2024 at 11:36 AM Paul Moore <paul@xxxxxxxxxxxxxx> wrote:
> > > > On Tue, Oct 8, 2024 at 12:57 PM Roberto Sassu
> > > > <roberto.sassu@xxxxxxxxxxxxxxx> wrote:
> > > > >
> > > > > From: Roberto Sassu <roberto.sassu@xxxxxxxxxx>
> > > > >
> > > > > Move out the mutex in the ima_iint_cache structure to a new structure
> > > > > called ima_iint_cache_lock, so that a lock can be taken regardless of
> > > > > whether or not inode integrity metadata are stored in the inode.
> > > > >
> > > > > Introduce ima_inode_security() to simplify accessing the new structure in
> > > > > the inode security blob.
> > > > >
> > > > > Move the mutex initialization and annotation in the new function
> > > > > ima_inode_alloc_security() and introduce ima_iint_lock() and
> > > > > ima_iint_unlock() to respectively lock and unlock the mutex.
> > > > >
> > > > > Finally, expand the critical region in process_measurement() guarded by
> > > > > iint->mutex up to where the inode was locked, use only one iint lock in
> > > > > __ima_inode_hash(), since the mutex is now in the inode security blob, and
> > > > > replace the inode_lock()/inode_unlock() calls in ima_check_last_writer().
> > > > >
> > > > > Signed-off-by: Roberto Sassu <roberto.sassu@xxxxxxxxxx>
> > > > > ---
> > > > > security/integrity/ima/ima.h | 26 ++++++++---
> > > > > security/integrity/ima/ima_api.c | 4 +-
> > > > > security/integrity/ima/ima_iint.c | 77 ++++++++++++++++++++++++++-----
> > > > > security/integrity/ima/ima_main.c | 39 +++++++---------
> > > > > 4 files changed, 104 insertions(+), 42 deletions(-)
> > > >
> > > > I'm not an IMA expert, but it looks reasonable to me, although
> > > > shouldn't this carry a stable CC in the patch metadata?
> > > >
> > > > Reviewed-by: Paul Moore <paul@xxxxxxxxxxxxxx>
> > >
> > > Sorry, one more thing ... did you verify this patchset resolves the
> > > syzbot problem? I saw at least one reproducer.
> >
> > Uhm, could not reproduce the deadlock with the reproducer. However,
> > without the patch I have a lockdep warning, and with I don't.
> >
> > I asked syzbot to try the patches. Let's see.
>
> I actually got a different lockdep warning:
>
> [ 904.603365] ======================================================
> [ 904.604264] WARNING: possible circular locking dependency detected
> [ 904.605141] 6.12.0-rc2+ #20 Not tainted
> [ 904.605697] ------------------------------------------------------

I can reproduce by executing the syzbot reproducer and in another
terminal by logging in with SSH (not all the times).

If I understood what the lockdep warning means, this is the scenario.

A task accesses a seq_file which is in the IMA policy, so we enter the
critical region guarded by the iint lock. But before we get the chance
to measure the file, a second task calls remap_file_pages() on the same
seq_file.

remap_file_pages() takes the mmap lock and, if the event matches the
IMA policy too, the second task waits for the iint lock to be released.

Now, the first task starts to measure the seq_file and takes the
seq_file lock. I don't know if 3 processes must be involved, but I was
thinking that reading the seq_file from the first task can trigger a
page fault, which requires to take the mmap lock.

At this point, we reach a deadlock. The first task waits for the mmap
lock to be released, and the second waits for the iint lock to be
released, which both cannot happen.

Roberto

> [ 904.606577] systemd/1 is trying to acquire lock:
> [ 904.607227] ffff88810e5c2580 (&p->lock){+.+.}-{4:4}, at: seq_read_iter+0x62/0x6b0
> [ 904.608290]
> [ 904.608290] but task is already holding lock:
> [ 904.609105] ffff88810f4abf20 (&ima_iint_lock_mutex_key[depth]){+.+.}-{4:4}, at: ima_iint_lock+0x24/0x40
> [ 904.610429]
> [ 904.610429] which lock already depends on the new lock.
> [ 904.610429]
> [ 904.611574]
> [ 904.611574] the existing dependency chain (in reverse order) is:
> [ 904.612628]
> [ 904.612628] -> #2 (&ima_iint_lock_mutex_key[depth]){+.+.}-{4:4}:
> [ 904.613681] __mutex_lock+0xaf/0x760
> [ 904.614266] mutex_lock_nested+0x27/0x40
> [ 904.614897] ima_iint_lock+0x24/0x40
> [ 904.615490] process_measurement+0x176/0xef0
> [ 904.616168] ima_file_mmap+0x98/0x120
> [ 904.616767] security_mmap_file+0x408/0x560
> [ 904.617444] __do_sys_remap_file_pages+0x2fa/0x4c0
> [ 904.618194] __x64_sys_remap_file_pages+0x29/0x40
> [ 904.618937] x64_sys_call+0x6e8/0x4550
> [ 904.619546] do_syscall_64+0x71/0x180
> [ 904.620155] entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [ 904.620952]
> [ 904.620952] -> #1 (&mm->mmap_lock){++++}-{4:4}:
> [ 904.621813] __might_fault+0x6f/0xb0
> [ 904.622400] _copy_to_iter+0x12e/0xa80
> [ 904.623009] seq_read_iter+0x593/0x6b0
> [ 904.623629] proc_reg_read_iter+0x31/0xe0
> [ 904.624276] vfs_read+0x256/0x3d0
> [ 904.624822] ksys_read+0x6d/0x160
> [ 904.625372] __x64_sys_read+0x1d/0x30
> [ 904.625964] x64_sys_call+0x1068/0x4550
> [ 904.626594] do_syscall_64+0x71/0x180
> [ 904.627188] entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [ 904.627975]
> [ 904.627975] -> #0 (&p->lock){+.+.}-{4:4}:
> [ 904.628787] __lock_acquire+0x17f3/0x2320
> [ 904.629432] lock_acquire+0xf2/0x420
> [ 904.630013] __mutex_lock+0xaf/0x760
> [ 904.630596] mutex_lock_nested+0x27/0x40
> [ 904.631225] seq_read_iter+0x62/0x6b0
> [ 904.631831] kernfs_fop_read_iter+0x1ef/0x2c0
> [ 904.632599] __kernel_read+0x113/0x350
> [ 904.633206] integrity_kernel_read+0x23/0x40
> [ 904.633902] ima_calc_file_hash_tfm+0x14e/0x230
> [ 904.634621] ima_calc_file_hash+0x97/0x250
> [ 904.635281] ima_collect_measurement+0x4be/0x530
> [ 904.636008] process_measurement+0x7c0/0xef0
> [ 904.636689] ima_file_check+0x65/0x80
> [ 904.637295] security_file_post_open+0xb1/0x1b0
> [ 904.638008] path_openat+0x216/0x1280
> [ 904.638605] do_filp_open+0xab/0x140
> [ 904.639185] do_sys_openat2+0xba/0x120
> [ 904.639805] do_sys_open+0x4c/0x80
> [ 904.640366] __x64_sys_openat+0x23/0x30
> [ 904.640992] x64_sys_call+0x2575/0x4550
> [ 904.641616] do_syscall_64+0x71/0x180
> [ 904.642207] entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [ 904.643003]
> [ 904.643003] other info that might help us debug this:
> [ 904.643003]
> [ 904.644149] Chain exists of:
> [ 904.644149] &p->lock --> &mm->mmap_lock --> &ima_iint_lock_mutex_key[depth]
> [ 904.644149]
> [ 904.645763] Possible unsafe locking scenario:
> [ 904.645763]
> [ 904.646614] CPU0 CPU1
> [ 904.647264] ---- ----
> [ 904.647909] lock(&ima_iint_lock_mutex_key[depth]);
> [ 904.648617] lock(&mm->mmap_lock);
> [ 904.649479] lock(&ima_iint_lock_mutex_key[depth]);
> [ 904.650543] lock(&p->lock);
> [ 904.650974]
> [ 904.650974] *** DEADLOCK ***
> [ 904.650974]
> [ 904.651826] 1 lock held by systemd/1:
> [ 904.652376] #0: ffff88810f4abf20 (&ima_iint_lock_mutex_key[depth]){+.+.}-{4:4}, at: ima_iint_lock+0x24/0x40
> [ 904.653759]
> [ 904.653759] stack backtrace:
> [ 904.654391] CPU: 2 UID: 0 PID: 1 Comm: systemd Not tainted 6.12.0-rc2+ #20
> [ 904.655360] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
> [ 904.656497] Call Trace:
> [ 904.656856] <TASK>
> [ 904.657166] dump_stack_lvl+0x134/0x1a0
> [ 904.657728] dump_stack+0x14/0x30
> [ 904.658206] print_circular_bug+0x38d/0x450
> [ 904.658812] check_noncircular+0xed/0x120
> [ 904.659396] ? srso_return_thunk+0x5/0x5f
> [ 904.659972] ? srso_return_thunk+0x5/0x5f
> [ 904.660569] __lock_acquire+0x17f3/0x2320
> [ 904.661145] lock_acquire+0xf2/0x420
> [ 904.661664] ? seq_read_iter+0x62/0x6b0
> [ 904.662217] ? srso_return_thunk+0x5/0x5f
> [ 904.662886] __mutex_lock+0xaf/0x760
> [ 904.663408] ? seq_read_iter+0x62/0x6b0
> [ 904.663961] ? seq_read_iter+0x62/0x6b0
> [ 904.664525] ? srso_return_thunk+0x5/0x5f
> [ 904.665098] ? mark_lock+0x4e/0x750
> [ 904.665610] ? mutex_lock_nested+0x27/0x40
> [ 904.666194] ? find_held_lock+0x3a/0x100
> [ 904.666770] mutex_lock_nested+0x27/0x40
> [ 904.667337] seq_read_iter+0x62/0x6b0
> [ 904.667869] kernfs_fop_read_iter+0x1ef/0x2c0
> [ 904.668536] __kernel_read+0x113/0x350
> [ 904.669079] integrity_kernel_read+0x23/0x40
> [ 904.669698] ima_calc_file_hash_tfm+0x14e/0x230
> [ 904.670349] ? __lock_acquire+0xc32/0x2320
> [ 904.670937] ? srso_return_thunk+0x5/0x5f
> [ 904.671525] ? __lock_acquire+0xfbb/0x2320
> [ 904.672113] ? srso_return_thunk+0x5/0x5f
> [ 904.672693] ? srso_return_thunk+0x5/0x5f
> [ 904.673280] ? lock_acquire+0xf2/0x420
> [ 904.673818] ? kernfs_iop_getattr+0x4a/0xb0
> [ 904.674424] ? srso_return_thunk+0x5/0x5f
> [ 904.674997] ? find_held_lock+0x3a/0x100
> [ 904.675564] ? srso_return_thunk+0x5/0x5f
> [ 904.676156] ? srso_return_thunk+0x5/0x5f
> [ 904.676740] ? srso_return_thunk+0x5/0x5f
> [ 904.677322] ? local_clock_noinstr+0x9/0xb0
> [ 904.677923] ? srso_return_thunk+0x5/0x5f
> [ 904.678502] ? srso_return_thunk+0x5/0x5f
> [ 904.679075] ? lock_release+0x4e2/0x570
> [ 904.679639] ima_calc_file_hash+0x97/0x250
> [ 904.680227] ima_collect_measurement+0x4be/0x530
> [ 904.680901] ? srso_return_thunk+0x5/0x5f
> [ 904.681496] ? srso_return_thunk+0x5/0x5f
> [ 904.682070] ? __kernfs_iattrs+0x4a/0x140
> [ 904.682658] ? srso_return_thunk+0x5/0x5f
> [ 904.683242] ? process_measurement+0x7c0/0xef0
> [ 904.683876] ? srso_return_thunk+0x5/0x5f
> [ 904.684462] process_measurement+0x7c0/0xef0
> [ 904.685078] ? srso_return_thunk+0x5/0x5f
> [ 904.685654] ? srso_return_thunk+0x5/0x5f
> [ 904.686228] ? _raw_spin_unlock_irqrestore+0x5d/0xd0
> [ 904.686938] ? srso_return_thunk+0x5/0x5f
> [ 904.687523] ? srso_return_thunk+0x5/0x5f
> [ 904.688098] ? srso_return_thunk+0x5/0x5f
> [ 904.688672] ? local_clock_noinstr+0x9/0xb0
> [ 904.689273] ? srso_return_thunk+0x5/0x5f
> [ 904.689846] ? srso_return_thunk+0x5/0x5f
> [ 904.690430] ? srso_return_thunk+0x5/0x5f
> [ 904.691005] ? srso_return_thunk+0x5/0x5f
> [ 904.691583] ? srso_return_thunk+0x5/0x5f
> [ 904.692180] ? local_clock_noinstr+0x9/0xb0
> [ 904.692841] ? srso_return_thunk+0x5/0x5f
> [ 904.693419] ? srso_return_thunk+0x5/0x5f
> [ 904.693990] ? lock_release+0x4e2/0x570
> [ 904.694544] ? srso_return_thunk+0x5/0x5f
> [ 904.695115] ? kernfs_put_active+0x5d/0xc0
> [ 904.695708] ? srso_return_thunk+0x5/0x5f
> [ 904.696286] ? kernfs_fop_open+0x376/0x6b0
> [ 904.696872] ima_file_check+0x65/0x80
> [ 904.697409] security_file_post_open+0xb1/0x1b0
> [ 904.698058] path_openat+0x216/0x1280
> [ 904.698589] do_filp_open+0xab/0x140
> [ 904.699106] ? srso_return_thunk+0x5/0x5f
> [ 904.699693] ? lock_release+0x554/0x570
> [ 904.700264] ? srso_return_thunk+0x5/0x5f
> [ 904.700836] ? do_raw_spin_unlock+0x76/0x140
> [ 904.701450] ? srso_return_thunk+0x5/0x5f
> [ 904.702021] ? _raw_spin_unlock+0x3f/0xa0
> [ 904.702606] ? srso_return_thunk+0x5/0x5f
> [ 904.703178] ? alloc_fd+0x1ca/0x3b0
> [ 904.703685] do_sys_openat2+0xba/0x120
> [ 904.704223] ? file_free+0x8d/0x110
> [ 904.704729] do_sys_open+0x4c/0x80
> [ 904.705221] __x64_sys_openat+0x23/0x30
> [ 904.705784] x64_sys_call+0x2575/0x4550
> [ 904.706337] do_syscall_64+0x71/0x180
> [ 904.706863] entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [ 904.707587] RIP: 0033:0x7f3462123037
> [ 904.708120] Code: 55 9c 48 89 75 a0 89 7d a8 44 89 55 ac e8 a1 7a f8 ff 44 8b 55 ac 8b 55 9c 41 89 c0 48 8b 75 a0 8b 7d a8 b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 89 45 ac e8 f6 7a f8 ff 8b 45 ac
> [ 904.710744] RSP: 002b:00007ffdd1a79370 EFLAGS: 00000293 ORIG_RAX: 0000000000000101
> [ 904.711821] RAX: ffffffffffffffda RBX: ffffffffffffffff RCX: 00007f3462123037
> [ 904.712829] RDX: 0000000000080100 RSI: 00007ffdd1a79420 RDI: 00000000ffffff9c
> [ 904.713839] RBP: 00007ffdd1a793e0 R08: 0000000000000000 R09: 0000000000000000
> [ 904.714848] R10: 0000000000000000 R11: 0000000000000293 R12: 00007ffdd1a794c8
> [ 904.715855] R13: 00007ffdd1a794b8 R14: 00007ffdd1a79690 R15: 00007ffdd1a79690
> [ 904.716877] </TASK>
>
> Roberto