Re: [PATCH 0/2] shm: omit forced shm destroy if task IPC namespace was changed

From: Alexander Mihalicyn
Date: Mon Jul 12 2021 - 15:27:21 EST


Hi Eric,

On Mon, Jul 12, 2021 at 10:18 PM Eric W. Biederman
<ebiederm@xxxxxxxxxxxx> wrote:
>
> Alexander Mihalicyn <alexander@xxxxxxxxxxxxx> writes:
>
> > Hello Manfred,
> >
> > On Sun, Jul 11, 2021 at 2:47 PM Manfred Spraul <manfred@xxxxxxxxxxxxxxxx> wrote:
> >>
> >> Hi Alex,
> >>
> >>
> >> Am Sonntag, 11. Juli 2021 schrieb Alexander Mihalicyn <alexander@xxxxxxxxxxxxx>:
> >> >
> >> > Hi, Manfred,
> >> >
> >> > On Sun, Jul 11, 2021 at 12:13 PM Manfred Spraul
> >> > <manfred@xxxxxxxxxxxxxxxx> wrote:
> >> > >
> >> > > Hi,
> >> > >
> >> > >
> >> > > Am Samstag, 10. Juli 2021 schrieb Alexander Mihalicyn <alexander@xxxxxxxxxxxxx>:
> >> > >>
> >> > >>
> >> > >> Now, using setns() syscall, we can construct situation when on
> >> > >> task->sysvshm.shm_clist list
> >> > >> we have shm items from several (!) IPC namespaces.
> >> > >>
> >> > >>
> >> > > Does this imply that locking ist affected as well? According to the initial patch, accesses to shm_clist are protected by "the" IPC shm namespace rwsem. This can't work if the list contains objects from several namespaces.
> >> >
> >> > Of course, you are right. I've to rework this part -> I can add check into
> >> > static int newseg(struct ipc_namespace *ns, struct ipc_params *params)
> >> > function and before adding new shm into task list check that list is empty OR
> >> > an item which is present on the list from the same namespace as
> >> > current->nsproxy->ipc_ns.
> >> >
> >> Ok. (Sorry, I have only smartphone internet, thus I could not check
> >> the patch fully)
> >>
> >> > >> I've proposed a change which keeps the old behaviour of setns() but
> >> > >> fixes double free.
> >> > >>
> >> > > Assuming that locking works, I would consider this as a namespace design question: Do we want to support that a task contains shm objects from several ipc namespaces?
> >> >
> >> > This depends on what we mean by "task contains shm objects from
> >> > several ipc namespaces". There are two meanings:
> >> >
> >> > 1. Task has attached shm object from different ipc namespaces
> >> >
> >> > We already support that by design. When we doing a change of namespace
> >> > using unshare(CLONE_NEWIPC) even with
> >> > sysctl shm_rmid_forced=1 we not detach all ipc's from task!
> >>
> >> OK. Thus shm and sem have different behavior anyways.
> >>
> >> >
> >> > 2. Task task->sysvshm.shm_clist list has items from different IPC namespaces.
> >> >
> >> > I'm not sure, do we need that or not. But I'm ready to prepare a patch
> >> > for any of the options which we choose:
> >> > a) just add exit_shm(current)+shm_init_task(current);
> >> > b) prepare PATCHv2 with appropriate check in the newseg() to prevent
> >> > adding new items from different namespace to the list
> >> > c) rework algorithm so we can safely have items from different
> >> > namespaces in task->sysvshm.shm_clist
> >> >
> >> Before you write something, let's wait what the others say. I don't
> >> qualify AS shm expert
> >>
> >> a) is user space visible, without any good excuse
> >
> > yes, but maybe we decide that this is not so critical?
> > We need more people here :)
>
> It is barely visible. You have to do something very silly
> to see this happening. It is probably ok, but the work to
> verify that nothing cares so that we can safely backport
> the change is probably much more work than just updating
> the list to handle shmid's for multiple namespaces.
>
>
> >> c) is probably highest amount of Changes
> >
> > yep. but ok, I will prepare patches fast.
>
> Given that this is a bug I think c) is the safest option.
>
> A couple of suggestions.
> 1) We can replace the test "shm_creator != NULL" with
> "list_empty(&shp->shm_clist)" and remove shm_creator.
>
> Along with replacing "shm_creator = NULL" with
> "list_del_init(&shp->shm_clist)".
>
> 2) We can update shmat to do "list_del_init(&shp->shm_clist)"
> upon shmat. The last unmap will still shm_destroy the
> shm segment as ns->shm_rmid_forced is set.
>
> For a multi-threaded process I think this will nicely clean up
> the clist, and make it clear that the clist only cares about
> those segments that have been created but never attached.
>
> 3) Put a non-reference counted struct ipc_namespace in struct
> shmid_kernel, and use it to remove the namespace parameter
> from shm_destroy.

Thanks for your detailed suggestions! ;)
I will prepare a patch tomorrow for kernel + test what's happening with
CRIU and will prepare a fix for it.

>
> I think that is enough to fix this bug with no changes in semantics,
> no additional memory consumed, and an implementation that is easier
> to read and perhaps a little faster.
>
> Eric

Regards,
Alex