Re: [PATCH v2] kernel: release ptraced tasks before zap_pid_ns_processes

From: Oleg Nesterov
Date: Tue Feb 26 2019 - 10:30:46 EST


On 02/26, Jiri Slaby wrote:
>
> On 10. 01. 19, 18:52, Andrei Vagin wrote:
> > --- a/kernel/exit.c
> > +++ b/kernel/exit.c
> > @@ -558,12 +558,14 @@ static struct task_struct *find_alive_thread(struct task_struct *p)
> >  	return NULL;
> >  }
> >
> > -static struct task_struct *find_child_reaper(struct task_struct *father)
> > +static struct task_struct *find_child_reaper(struct task_struct *father,
> > +						struct list_head *dead)
> >  	__releases(&tasklist_lock)
> >  	__acquires(&tasklist_lock)
> >  {
> >  	struct pid_namespace *pid_ns = task_active_pid_ns(father);
> >  	struct task_struct *reaper = pid_ns->child_reaper;
> > +	struct task_struct *p, *n;
> >
> >  	if (likely(reaper != father))
> >  		return reaper;
> > @@ -579,6 +581,12 @@ static struct task_struct *find_child_reaper(struct task_struct *father)
> >  		panic("Attempted to kill init! exitcode=0x%08x\n",
> >  			father->signal->group_exit_code ?: father->exit_code);
> >  	}
> > +
> > +	list_for_each_entry_safe(p, n, dead, ptrace_entry) {
> > +		list_del_init(&p->ptrace_entry);
> > +		release_task(p);
> > +	}
> > +
>
> Hi,
>
> from our (SUSE) QA we received a report that this patch causes a
> performance decline in libmicro pthread_* benchmark as reported in:
> https://bugzilla.suse.com/show_bug.cgi?id=1126762

Access Denied

> I tried myself from the repo:
> https://github.com/redhat-performance/libMicro
>
> I ran
> pthread_create -B 8 -C 200 -S
>
> and with the patch applied:
> # STATISTICS usecs/call (raw) usecs/call (outliers removed)
> # mean 41.36539 39.21347
>
> Without:
> # mean 23.38611 17.29311

can't reproduce, I see the same numbers with or without this patch.
However, I did "./bin/pthread_create -B 8 -C 200 -S" under KVM.

> The benchmark seems to create 8 (-B above) pthreads, does lock/unlock in
> them and then the threads exit. The benchmark reaps the threads via
> pthread_join. This all happens 200 times (-C above).

Given that this test-case doesn't use CLONE_NEWPID, I fail to understand how
this patch can make any noticeable difference performance wise...
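
For reference, the workload as described above boils down to roughly the
following. This is only a sketch of that description, not libMicro's code:
the names worker(), run_batch() and run_benchmark() are made up, and the
constants simply mirror the -B 8 / -C 200 arguments. Note that it only
creates and joins ordinary threads, no pid namespace is ever created.

```c
#include <assert.h>
#include <pthread.h>

#define NTHREADS 8	/* mirrors "-B 8": threads per batch */
#define NBATCHES 200	/* mirrors "-C 200": number of batches */

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long counter;	/* bumped once per thread, under the lock */

/* Each thread does one lock/unlock pair and then exits. */
static void *worker(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&lock);
	counter++;
	pthread_mutex_unlock(&lock);
	return NULL;
}

/*
 * Create one batch of threads and reap them via pthread_join().
 * Sketch only: on a pthread_create() failure the threads that were
 * already started are not joined.
 */
static int run_batch(void)
{
	pthread_t tid[NTHREADS];
	int i;

	for (i = 0; i < NTHREADS; i++)
		if (pthread_create(&tid[i], NULL, worker, NULL))
			return -1;
	for (i = 0; i < NTHREADS; i++)
		if (pthread_join(tid[i], NULL))
			return -1;
	return 0;
}

/* Run all batches; returns the number of threads that ran, or -1. */
static long run_benchmark(void)
{
	int i;

	counter = 0;
	for (i = 0; i < NBATCHES; i++)
		if (run_batch())
			return -1;
	return counter;
}
```

Calling run_benchmark() from main() and timing it (or running it under
perf) should exercise the same thread create/exit/join paths.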

with this patch forget_original_parent() just passes the additional argument
to find_child_reaper(), nothing else.

The extra list_for_each_entry_safe/release_task loop can't happen, and even
if it could it shouldn't cause any performance regression either.
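
To illustrate the point, here is a small userspace model of the control
flow in the quoted hunk. It is not kernel code: struct task, struct pid_ns,
the "released" counter and find_child_reaper_model() are hypothetical
stand-ins, and the dead list is reduced to a count. The only thing it is
meant to show is that the new loop sits behind the "reaper != father" early
return.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for task_struct / pid_namespace. */
struct task {
	int id;
};

struct pid_ns {
	struct task *child_reaper;	/* the namespace's init */
};

static int released;	/* how many times the new loop body ran */

/*
 * Model of find_child_reaper() with the patch applied. ndead stands in
 * for the number of entries on the "dead" list.
 */
static struct task *find_child_reaper_model(struct task *father,
					    struct pid_ns *ns, int ndead)
{
	struct task *reaper = ns->child_reaper;
	int i;

	/* Fast path: every exiting task that is not the ns init. */
	if (reaper != father)
		return reaper;

	/* Only reached when the namespace's init itself is exiting. */
	for (i = 0; i < ndead; i++)
		released++;	/* stands in for release_task(p) */

	return father;
}
```

An ordinary exiting thread returns on the first check, so the loop body
never runs for it no matter what is on the list.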

> Any idea how to restore the performance close to the previous state?

maybe you can try perf to find out where this difference comes from?

Oleg.