Re: Linux 5.18-rc4
From: Linus Torvalds
Date: Mon Jun 06 2022 - 14:28:51 EST
On Mon, Jun 6, 2022 at 8:19 AM Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
> Has anyone looked into this lock ordering issues?
The deadlock is
> >> [78140.503821] CPU0 CPU1
> >> [78140.503823] ---- ----
> >> [78140.503824] lock(&newf->file_lock);
> >> [78140.503826] lock(&p->alloc_lock);
> >> [78140.503828] lock(&newf->file_lock);
> >> [78140.503830] lock(&ctx->lock);
and the alloc_lock -> file_lock on CPU1 is trivial - it's seq_show()
in fs/proc/fd.c:
task_lock(task);
files = task->files;
if (files) {
unsigned int fd = proc_fd(m->private);
spin_lock(&files->file_lock);
and that looks all normal.
But the other chains look painful.
I do see the IPC code doing ugly things, in particular I detest this code:
task_lock(current);
list_add(&shp->shm_clist, ¤t->sysvshm.shm_clist);
task_unlock(current);
where it is using the task lock to protect the shm_clist list. Nasty.
And it's doing that inside the shm_ids.rwsem lock _and_ inside the
shp->shm_perm.lock.
So the IPC code has newseg() doing
shmget ->
ipcget():
down_write(ids->rwsem) ->
newseg():
ipc_addid gets perm->lock
task_lock(current)
so you have
ids->rwsem -> perm->lock -> alloc_lock
there.
So now we have that
ids->rwsem -> ipcperm->lock -> alloc_lock -> file_lock
when you put those sequences together.
But I didn't figure out what the security subsystem angle is and how
that then apparently mixes things up with execve.
Yes, newseg() is doing that
error = security_shm_alloc(&shp->shm_perm);
while holding rwsem, but I can't see how that matters. From the
lockdep output, rwsem doesn't actually seem to be part of the whole
sequence.
It *looks* like we have
apparmour ctx->lock -->
radix_tree_preloads.lock -->
ipcperm->lock
and apparently that's called under the file_lock somewhere, completing
the circle.
I guess the execve component is that
begin_new_exec ->
security_bprm_committing_creds ->
apparmor_bprm_committing_creds ->
aa_inherit_files ->
iterate_fd -> *takes file_lock*
match_file ->
aa_file_perm ->
update_file_ctx *takes ctx->lock*
so that's how you get file_lock -> ctx->lock.
So you have:
SHMGET:
ipcperm->lock -> alloc_lock
/proc:
alloc_lock -> file_lock
apparmor_bprm_committing_creds:
file_lock -> ctx->lock
and then all you need is ctx->lock -> ipcperm->lock but I didn't find that part.
I suspect that part is that both Apparmor and IPC use the idr local lock.
Linus