Re: [PATCH v3 1/2] x86_64,signal: Fix SS handling for signals delivered to 64-bit programs

From: Andy Lutomirski
Date: Thu Mar 19 2015 - 12:08:36 EST


On Thu, Mar 19, 2015 at 12:35 AM, Cyrill Gorcunov <gorcunov@xxxxxxxxx> wrote:
> On Wed, Mar 18, 2015 at 03:03:27PM -0700, Andy Lutomirski wrote:
>> On Wed, Mar 18, 2015 at 2:34 PM, Pavel Emelyanov <xemul@xxxxxxxxxxxxx> wrote:
>> > On 03/19/2015 12:26 AM, Andy Lutomirski wrote:
>> >> On Wed, Mar 18, 2015 at 1:02 PM, Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
>> >>> On 03/18, Andrey Wagin wrote:
>> >>>>
>> >>>> This patch fixes the problem. Oleg, could you send this patch to the
>> >>>> criu mailing list?
>> >>>
>> >>> Sure, will do.
>> >>
>> >> We still haven't answered one question: what's the kernel's position
>> >> on ABI stability wrt CRIU? We clearly shouldn't make changes that
>> >> break the principle of CRIU, but CRIU encodes so many tricky
>> >> assumptions about the inner workings of the kernel that it's really
>> >> tough to avoid breaking old CRIU versions.
>> >
>> > Well, we try hard to use only documented kernel API-s. Isn't the sigframe
>> > considered to be some sort of "stable API"? I mean -- it's visible by the
>> > userspace, nobody prevents glibc or gdb from messing with this stuff just
>> > by reading it from memory.
>> >
>> > If it's "parse-able" e.g. like VDSO is, but we don't do it in CRIU -- then
>> > it's definitely a CRIU BUG to be fixed.
>>
>> It's certainly parseable by things like gdb. But it's also supposed
>> to be extensible. hpa, any thoughts here?
>>
>> >
>> >> So... do we introduce somewhat nasty code into the kernel to keep old
>> >> CRIU versions working, or do we require that users who want to restore
>> >> onto new kernels use new CRIU?
>> >
>> > It's OK (I think) to require newer versions of CRIU, it's easy to update
>> > one unlike the kernel ;)
>> >
>> > But if an "old" version of CRIU just crashes the restored processes on "new"
>> > kernels and there's no way to detect this properly -- that's the problem.
>>
>> Yeah, that's unfortunate.
>>
>> I don't have a great idea for how to work around this, unfortunately.
>> Ideally we'd increment some kind of version counter or use an
>> extension mechanism rather than shoving ss into a field that used to
>> be padding.
>
> fwiw currently we're passing zero in this __pad0 (replying to your
> previous email, so we can workaround in the kernel assuming zero
> as a special case, not that good but better than nothing).

We could store ss_plus_one instead of ss. Yuck.

The only real downside I can see to special-casing zero is that it
really is possible to end up with zero in there. For example, the
SIGSEGV you get due to the failed sigreturn probably has sigcontext->ss
== 0 :)
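
To make the two options concrete, here is a rough userspace sketch, not
kernel code. It assumes the slot is a 16-bit field that old writers leave
at zero; struct frame_slot and all four helper names are made up for
illustration and do not exist in the kernel or in CRIU.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the 16-bit slot that used to be padding. */
struct frame_slot {
	uint16_t raw;
};

/* Scheme A: store ss as-is and treat raw == 0 as "writer did not set ss". */
void store_plain(struct frame_slot *s, uint16_t ss)
{
	s->raw = ss;
}

bool load_plain(const struct frame_slot *s, uint16_t *ss)
{
	if (s->raw == 0)
		return false;	/* ambiguous: a genuine ss == 0 lands here too */
	*ss = s->raw;
	return true;
}

/* Scheme B: store ss + 1, so raw == 0 can only come from an old writer. */
void store_plus_one(struct frame_slot *s, uint16_t ss)
{
	/* In a 16-bit slot, ss == 0xffff wraps to 0 and is lost. */
	s->raw = (uint16_t)(ss + 1);
}

bool load_plus_one(const struct frame_slot *s, uint16_t *ss)
{
	if (s->raw == 0)
		return false;	/* old-format frame, no ss recorded */
	*ss = (uint16_t)(s->raw - 1);
	return true;
}
```

With scheme A, a frame written with ss == 0 reads back as "not set". With
scheme B the same frame reads back as ss == 0, at the cost of every reader
having to know about the offset, and of 0xffff no longer being
representable in this sketch.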

--Andy

>
>>
>> --Andy
>>
>> >
>> >> (It seems clear to me that CRIU should apply the patch regardless of
>> >> what the kernel does. It will enable CRIU to work on the same class
>> >> of programs that are fixed by the kernel change that started this
>> >> thread.)
>> >>
>> >> --Andy
>> >> .
>> >>
>> >
>> > Thanks,
>> > Pavel
>> >
>>
>>
>>
>> --
>> Andy Lutomirski
>> AMA Capital Management, LLC
>>
>
> Cyrill



--
Andy Lutomirski
AMA Capital Management, LLC