Re: perf_event_open+clone = unkillable process

From: Eric W. Biederman
Date: Tue Feb 05 2019 - 01:07:36 EST


ebiederm@xxxxxxxxxxxx (Eric W. Biederman) writes:

> ebiederm@xxxxxxxxxxxx (Eric W. Biederman) writes:
>
>> Thomas Gleixner <tglx@xxxxxxxxxxxxx> writes:
>>
>>> On Mon, 4 Feb 2019, Dmitry Vyukov wrote:
>>>
>>>> On Mon, Feb 4, 2019 at 10:27 AM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>>>> >
>>>> > On Fri, 1 Feb 2019, Dmitry Vyukov wrote:
>>>> >
>>>> > > On Fri, Feb 1, 2019 at 5:48 PM Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
>>>> > > >
>>>> > > > Hello,
>>>> > > >
>>>> > > > The following program creates an unkillable process that eats CPU.
>>>> > > > /proc/pid/stack is empty, I am not sure what other info I can provide.
>>>> > > >
>>>> > > > Tested is on upstream commit 4aa9fc2a435abe95a1e8d7f8c7b3d6356514b37a.
>>>> > > > Config is attached.
>>>> > >
>>>> > > Looking through other reproducers that create unkillable processes, I
>>>> > > think I found a much simpler reproducer (below). It's single-threaded
>>>> > > and just sets up a SIGBUS handler and does timer_create+timer_settime
>>>> > > to send repeated SIGBUS. The resulting process can't be killed with
>>>> > > SIGKILL.
>>>> > > +Thomas for timers.
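
For reference, a minimal program along the lines Dmitry describes might look
like this (an illustrative sketch only, not his reproducer; link with -lrt on
older glibc, and the real thing may well direct the signal at a specific
thread with SIGEV_THREAD_ID):

#include <signal.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

static void bus_handler(int sig)
{
	(void)sig;			/* nothing to do; the flood itself is the point */
}

int main(void)
{
	struct sigaction sa;
	struct sigevent sev;
	struct itimerspec its;
	timer_t timer;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = bus_handler;
	sigaction(SIGBUS, &sa, NULL);

	memset(&sev, 0, sizeof(sev));
	sev.sigev_notify = SIGEV_SIGNAL;	/* the real reproducer may use
						 * SIGEV_THREAD_ID to make the
						 * signal thread-directed */
	sev.sigev_signo = SIGBUS;
	timer_create(CLOCK_MONOTONIC, &sev, &timer);

	memset(&its, 0, sizeof(its));
	its.it_value.tv_nsec = 1;		/* fire as soon as possible ... */
	its.it_interval.tv_nsec = 1;		/* ... and keep re-arming */
	timer_settime(timer, 0, &its, NULL);

	for (;;)
		pause();			/* a SIGKILL from elsewhere is what
						 * ought to end this */
}
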
>>>> >
>>>> > +Oleg, Eric
>>>> >
>>>> > That's odd. With some tracing I can see that SIGKILL is generated and
>>>> > queued, but it's not delivered for some weird reason. I'm traveling in the
>>>> > next days, so I won't be able to do much about it. Will look later this
>>>> > week.
>>>>
>>>> Just a random thought looking at the repro: can constant SIGBUS
>>>> delivery starve delivery of all other signals (incl SIGKILL)?
>>>
>>> Indeed. SIGBUS is 7, SIGKILL is 9 and next_signal() delivers the lowest
>>> number first....
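
To spell out the ordering Thomas describes, a simplified illustration (this
is not the kernel's next_signal(), just the lowest-number-first idea):

#include <signal.h>
#include <stdio.h>

static int pick_lowest_pending(const sigset_t *pending)
{
	/* Linux signals run 1..64 (up to SIGRTMAX). */
	for (int sig = 1; sig <= 64; sig++)
		if (sigismember(pending, sig) == 1)
			return sig;
	return 0;
}

int main(void)
{
	sigset_t pending;

	sigemptyset(&pending);
	sigaddset(&pending, SIGKILL);	/* 9 */
	sigaddset(&pending, SIGBUS);	/* 7 here */

	/* Prints 7: as long as the timer keeps SIGBUS pending, the scan
	 * never selects the pending SIGKILL. */
	printf("%d\n", pick_lowest_pending(&pending));
	return 0;
}
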
>>
>> We do have the special case in complete_signal that causes most of the
>> signal delivery work of SIGKILL to happen when SIGKILL is queued.
>>
>> I need to look at your reproducer. It would require being a per-thread
>> signal to cause problems in next_signal.
>>
>> It is definitely worth fixing if there is any way for userspace to block
>> SIGKILL.
>
> Ugh.
>
> The practical problem appears much worse.
>
> Tracing the code I see that we attempt to deliver SIGBUS, I presume in a
> per-thread way.
>
> At some point the delivery of SIGBUS fails. Then the kernel attempts
> to synchronously force SIGSEGV. Which should be the end of it.
>
> Unfortunately at that point our heuristic for dealing with synchronous
> signals fails in next_signal and we attempt to deliver the timer's
> SIGBUS instead.
>
> I suspect it is time to bite the bullet and handle the synchronous
> unblockable signals differently. I will see if I can cook up an
> appropriate patch.

Playing with this some more, what I see happening is:


SIGHUP and SIGSEGV get directed at sighup_handler.
Timer delivers SIGHUP
sighup_handler starts.
timer delivers SIGHUP (before sighup_handler finishes)
sighup_handler starts.
timer delivers SIGHUP (before sighup_handler finishes)
sighup_handler starts.
timer delivers SIGHUP (before sighup_handler finishes)
sighup_handler starts.
....
Up until the stack is full.
Then:
timer delivers SIGHUP
sighup_handler won't start
Attempt force_sigsegv
Confused kernel attempts to deliver SIGHUP (instead of a synchronous SIGSEGV)

If you unconfuse the kernel there is an attempt to deliver SIGSEGV,
which fails because the stack is full.
Then the kernel resets the SIGSEGV handler to SIG_DFL.
Then SIGSEGV is successfully delivered, terminating the application.
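
To watch that nesting from userspace, a sketch like the following can be
used (the SA_NODEFER and the pause() in the handler are my way of forcing
the nesting; I don't know how the actual reproducer arranges it):

#include <signal.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

static void hup_handler(int sig)
{
	(void)sig;
	write(STDERR_FILENO, ".", 1);	/* one dot per nested handler invocation */
	pause();			/* never return, so the next SIGHUP has
					 * to nest on top of this frame */
}

int main(void)
{
	struct sigaction sa;
	struct sigevent sev;
	struct itimerspec its;
	timer_t timer;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = hup_handler;
	sa.sa_flags = SA_NODEFER;	/* let SIGHUP interrupt its own handler */
	sigaction(SIGHUP, &sa, NULL);

	memset(&sev, 0, sizeof(sev));
	sev.sigev_notify = SIGEV_SIGNAL;
	sev.sigev_signo = SIGHUP;
	timer_create(CLOCK_MONOTONIC, &sev, &timer);

	memset(&its, 0, sizeof(its));
	its.it_value.tv_nsec = 1;
	its.it_interval.tv_nsec = 1;
	timer_settime(timer, 0, &its, NULL);

	for (;;)
		pause();
}

Each nested handler frame eats user stack until there is no room left to set
up the next signal frame, which is where the force_sigsegv path above comes in.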

Which suggests two fixes:
1) Making SIGKILL something that can't get hidden behind other signals.
2) Having a third queue of signals for the synchronous signals, so that
   the synchronous signals can't be blocked by per-thread signals.

I have prototyped the 2nd one and it is enough to stop the infinite spin
that causes problems here when the process stack fills up.
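
To illustrate what option 2 buys us, a userspace sketch of the selection
order it implies (pick_signal() here is a made-up stand-in, not the
prototype patch; the real change is in the kernel's signal delivery path):

#include <signal.h>
#include <stdio.h>

/* Fault-style signals that the kernel generates synchronously. */
static const int synchronous_sigs[] = {
	SIGSEGV, SIGBUS, SIGILL, SIGFPE, SIGTRAP, SIGSYS,
};

/* Made-up stand-in for the selection step, not the kernel's next_signal(). */
static int pick_signal(const sigset_t *thread_pending, const sigset_t *shared_pending)
{
	/* Look at the synchronous signals first; they are thread-directed,
	 * so only the per-thread set needs checking. */
	for (size_t i = 0; i < sizeof(synchronous_sigs) / sizeof(synchronous_sigs[0]); i++)
		if (sigismember(thread_pending, synchronous_sigs[i]) == 1)
			return synchronous_sigs[i];

	/* Otherwise fall back to the usual lowest-number-first scan. */
	for (int sig = 1; sig <= 64; sig++)
		if (sigismember(thread_pending, sig) == 1 ||
		    sigismember(shared_pending, sig) == 1)
			return sig;
	return 0;
}

int main(void)
{
	sigset_t thread_pending, shared_pending;

	sigemptyset(&thread_pending);
	sigemptyset(&shared_pending);
	sigaddset(&thread_pending, SIGHUP);	/* the flooding timer signal (1) */
	sigaddset(&thread_pending, SIGSEGV);	/* the forced synchronous signal (11) */

	/* Prints 11: the forced SIGSEGV is chosen even though SIGHUP has
	 * the lower signal number. */
	printf("%d\n", pick_signal(&thread_pending, &shared_pending));
	return 0;
}

With that ordering a flood of queued timer signals can no longer starve the
forced SIGSEGV; checking for a pending SIGKILL up front in the same spot
would be one way to get fix 1 as well.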

Fixing SIGKILL will probably bring more benefits.

Eric