Re: [PATCH 0/2] shm: omit forced shm destroy if task IPC namespace was changed
From: Alexander Mihalicyn
Date: Wed Jul 14 2021 - 13:42:56 EST
Hi Eric,
I've sent the RFC patch "shm: extend forced shm destroy to support
objects from", please take a look once time permits.
A few words about design:
1. In the exit_shm() function I'm using a temporary list to collect
shm's before destroying because I can't take semaphore (which protects
ns idr structure) under spinlock (which protects list on task)
2. I've to take IPC ns refcounter in exit_shm() to prevent possible
race with shm_exit_ns().
3. I haven't added list_del_init(&shp->shm_clist) into shmat() syscall
because I can't understand why this is safe. But maybe I've to think
more about that.
4. I need to remove the extra "ns" argument (todo).
Thanks,
Alex
On Mon, Jul 12, 2021 at 10:27 PM Alexander Mihalicyn
<alexander@xxxxxxxxxxxxx> wrote:
>
> Hi Eric,
>
> On Mon, Jul 12, 2021 at 10:18 PM Eric W. Biederman
> <ebiederm@xxxxxxxxxxxx> wrote:
> >
> > Alexander Mihalicyn <alexander@xxxxxxxxxxxxx> writes:
> >
> > > Hello Manfred,
> > >
> > > On Sun, Jul 11, 2021 at 2:47 PM Manfred Spraul <manfred@xxxxxxxxxxxxxxxx> wrote:
> > >>
> > >> Hi Alex,
> > >>
> > >>
> > >> Am Sonntag, 11. Juli 2021 schrieb Alexander Mihalicyn <alexander@xxxxxxxxxxxxx>:
> > >> >
> > >> > Hi, Manfred,
> > >> >
> > >> > On Sun, Jul 11, 2021 at 12:13 PM Manfred Spraul
> > >> > <manfred@xxxxxxxxxxxxxxxx> wrote:
> > >> > >
> > >> > > Hi,
> > >> > >
> > >> > >
> > >> > > Am Samstag, 10. Juli 2021 schrieb Alexander Mihalicyn <alexander@xxxxxxxxxxxxx>:
> > >> > >>
> > >> > >>
> > >> > >> Now, using setns() syscall, we can construct situation when on
> > >> > >> task->sysvshm.shm_clist list
> > >> > >> we have shm items from several (!) IPC namespaces.
> > >> > >>
> > >> > >>
> > >> > > Does this imply that locking ist affected as well? According to the initial patch, accesses to shm_clist are protected by "the" IPC shm namespace rwsem. This can't work if the list contains objects from several namespaces.
> > >> >
> > >> > Of course, you are right. I've to rework this part -> I can add check into
> > >> > static int newseg(struct ipc_namespace *ns, struct ipc_params *params)
> > >> > function and before adding new shm into task list check that list is empty OR
> > >> > an item which is present on the list from the same namespace as
> > >> > current->nsproxy->ipc_ns.
> > >> >
> > >> Ok. (Sorry, I have only smartphone internet, thus I could not check
> > >> the patch fully)
> > >>
> > >> > >> I've proposed a change which keeps the old behaviour of setns() but
> > >> > >> fixes double free.
> > >> > >>
> > >> > > Assuming that locking works, I would consider this as a namespace design question: Do we want to support that a task contains shm objects from several ipc namespaces?
> > >> >
> > >> > This depends on what we mean by "task contains shm objects from
> > >> > several ipc namespaces". There are two meanings:
> > >> >
> > >> > 1. Task has attached shm object from different ipc namespaces
> > >> >
> > >> > We already support that by design. When we doing a change of namespace
> > >> > using unshare(CLONE_NEWIPC) even with
> > >> > sysctl shm_rmid_forced=1 we not detach all ipc's from task!
> > >>
> > >> OK. Thus shm and sem have different behavior anyways.
> > >>
> > >> >
> > >> > 2. Task task->sysvshm.shm_clist list has items from different IPC namespaces.
> > >> >
> > >> > I'm not sure, do we need that or not. But I'm ready to prepare a patch
> > >> > for any of the options which we choose:
> > >> > a) just add exit_shm(current)+shm_init_task(current);
> > >> > b) prepare PATCHv2 with appropriate check in the newseg() to prevent
> > >> > adding new items from different namespace to the list
> > >> > c) rework algorithm so we can safely have items from different
> > >> > namespaces in task->sysvshm.shm_clist
> > >> >
> > >> Before you write something, let's wait what the others say. I don't
> > >> qualify AS shm expert
> > >>
> > >> a) is user space visible, without any good excuse
> > >
> > > yes, but maybe we decide that this is not so critical?
> > > We need more people here :)
> >
> > It is barely visible. You have to do something very silly
> > to see this happening. It is probably ok, but the work to
> > verify that nothing cares so that we can safely backport
> > the change is probably much more work than just updating
> > the list to handle shmid's for multiple namespaces.
> >
> >
> > >> c) is probably highest amount of Changes
> > >
> > > yep. but ok, I will prepare patches fast.
> >
> > Given that this is a bug I think c) is the safest option.
> >
> > A couple of suggestions.
> > 1) We can replace the test "shm_creator != NULL" with
> > "list_empty(&shp->shm_clist)" and remove shm_creator.
> >
> > Along with replacing "shm_creator = NULL" with
> > "list_del_init(&shp->shm_clist)".
> >
> > 2) We can update shmat to do "list_del_init(&shp->shm_clist)"
> > upon shmat. The last unmap will still shm_destroy the
> > shm segment as ns->shm_rmid_forced is set.
> >
> > For a multi-threaded process I think this will nicely clean up
> > the clist, and make it clear that the clist only cares about
> > those segments that have been created but never attached.
> >
> > 3) Put a non-reference counted struct ipc_namespace in struct
> > shmid_kernel, and use it to remove the namespace parameter
> > from shm_destroy.
>
> Thanks for your detailed suggestions! ;)
> I will prepare a patch tomorrow for kernel + test what's happening with
> CRIU and will prepare a fix for it.
>
> >
> > I think that is enough to fix this bug with no changes in semantics,
> > no additional memory consumed, and an implementation that is easier
> > to read and perhaps a little faster.
> >
> > Eric
>
> Regards,
> Alex