Re: [PATCH 0/2] shm: omit forced shm destroy if task IPC namespace was changed
From: Eric W. Biederman
Date: Mon Jul 12 2021 - 15:18:39 EST
Alexander Mihalicyn <alexander@xxxxxxxxxxxxx> writes:
> Hello Manfred,
> On Sun, Jul 11, 2021 at 2:47 PM Manfred Spraul <manfred@xxxxxxxxxxxxxxxx> wrote:
>> Hi Alex,
>> Am Sonntag, 11. Juli 2021 schrieb Alexander Mihalicyn <alexander@xxxxxxxxxxxxx>:
>> > Hi, Manfred,
>> > On Sun, Jul 11, 2021 at 12:13 PM Manfred Spraul
>> > <manfred@xxxxxxxxxxxxxxxx> wrote:
>> > >
>> > > Hi,
>> > >
>> > >
>> > > Am Samstag, 10. Juli 2021 schrieb Alexander Mihalicyn <alexander@xxxxxxxxxxxxx>:
>> > >>
>> > >>
>> > >> Now, using setns() syscall, we can construct situation when on
>> > >> task->sysvshm.shm_clist list
>> > >> we have shm items from several (!) IPC namespaces.
>> > >>
>> > >>
>> > > Does this imply that locking ist affected as well? According to the initial patch, accesses to shm_clist are protected by "the" IPC shm namespace rwsem. This can't work if the list contains objects from several namespaces.
>> > Of course, you are right. I've to rework this part -> I can add check into
>> > static int newseg(struct ipc_namespace *ns, struct ipc_params *params)
>> > function and before adding new shm into task list check that list is empty OR
>> > an item which is present on the list from the same namespace as
>> > current->nsproxy->ipc_ns.
>> Ok. (Sorry, I have only smartphone internet, thus I could not check
>> the patch fully)
>> > >> I've proposed a change which keeps the old behaviour of setns() but
>> > >> fixes double free.
>> > >>
>> > > Assuming that locking works, I would consider this as a namespace design question: Do we want to support that a task contains shm objects from several ipc namespaces?
>> > This depends on what we mean by "task contains shm objects from
>> > several ipc namespaces". There are two meanings:
>> > 1. Task has attached shm object from different ipc namespaces
>> > We already support that by design. When we doing a change of namespace
>> > using unshare(CLONE_NEWIPC) even with
>> > sysctl shm_rmid_forced=1 we not detach all ipc's from task!
>> OK. Thus shm and sem have different behavior anyways.
>> > 2. Task task->sysvshm.shm_clist list has items from different IPC namespaces.
>> > I'm not sure, do we need that or not. But I'm ready to prepare a patch
>> > for any of the options which we choose:
>> > a) just add exit_shm(current)+shm_init_task(current);
>> > b) prepare PATCHv2 with appropriate check in the newseg() to prevent
>> > adding new items from different namespace to the list
>> > c) rework algorithm so we can safely have items from different
>> > namespaces in task->sysvshm.shm_clist
>> Before you write something, let's wait what the others say. I don't
>> qualify AS shm expert
>> a) is user space visible, without any good excuse
> yes, but maybe we decide that this is not so critical?
> We need more people here :)
It is barely visible. You have to do something very silly
to see this happening. It is probably ok, but the work to
verify that nothing cares so that we can safely backport
the change is probably much more work than just updating
the list to handle shmid's for multiple namespaces.
>> c) is probably highest amount of Changes
> yep. but ok, I will prepare patches fast.
Given that this is a bug I think c) is the safest option.
A couple of suggestions.
1) We can replace the test "shm_creator != NULL" with
"list_empty(&shp->shm_clist)" and remove shm_creator.
Along with replacing "shm_creator = NULL" with
2) We can update shmat to do "list_del_init(&shp->shm_clist)"
upon shmat. The last unmap will still shm_destroy the
shm segment as ns->shm_rmid_forced is set.
For a multi-threaded process I think this will nicely clean up
the clist, and make it clear that the clist only cares about
those segments that have been created but never attached.
3) Put a non-reference counted struct ipc_namespace in struct
shmid_kernel, and use it to remove the namespace parameter
I think that is enough to fix this bug with no changes in semantics,
no additional memory consumed, and an implementation that is easier
to read and perhaps a little faster.