Re: sigaltstack breaks swapcontext()

From: Stas Sergeev
Date: Wed Jan 06 2016 - 14:32:45 EST


06.01.2016 22:14, Andy Lutomirski wrote:
On Wed, Jan 6, 2016 at 10:42 AM, Stas Sergeev <stsp@xxxxxxx> wrote:
06.01.2016 21:05, Andy Lutomirski wrote:
On Wed, Jan 6, 2016 at 7:45 AM, Stas Sergeev <stsp@xxxxxxx> wrote:
Hello.

swapcontext() can be used with signal handlers,
it swaps the signal masks together with the other
parts of the context.
Unfortunately, Linux implements sigaltstack()
in a way that makes it impossible to use with
swapcontext().
Per the man page, sigaltstack is allowed to return
EPERM if the process is altering its sigaltstack while
running on sigaltstack. This is likely needed to
consistently return oss->ss_flags, which indicates
whether the process is on the sigaltstack or not.
Unfortunately, Linux takes that permission to return
EPERM too literally: it returns EPERM even if you
don't want to change to another sigaltstack, but
only want to disable sigaltstack with SS_DISABLE.
By my reading of the man page, this is not the desired
behaviour. Moreover, you can't use swapcontext()
without disabling sigaltstack first, or the stack will
be re-used and overwritten by a subsequent signal.

The EPERM thing is probably also to preserve the behavior that nested
SA_ONSTACK signals are supposed to work.
Could you please clarify?
If I set up another stack inside the sighandler, the
nested SA_ONSTACK signal can just use that new stack,
which seems safe and sane. So I don't think EPERM helps
the nested signals, or could you explain the possible breakage
scenario?
It's probably safe in most cases, but the current behavior explicitly
checks whether you're on the alt stack during signal delivery,
... so as not to re-use it accidentally.
But if you set up a new sigaltstack, then perhaps the
kernel can just check for overlaps with the current one to stay
safe (but I'd rather not check anything).

The work-around for this is not even trivial: I have
to use the shm tricks to duplicate the sigaltstack in
the VA space, and move the stack pointer to another
mirror before calling sigaltstack. Then I use longjmp()
to restore the stack pointer. Then I can finally use
swapcontext(). This is an unpleasant work-around.

The fix on a kernel side looks simple: kernel should
just use ss_flags to determine whether the sigaltstack
is active. I can make a patch for that, but the problem
is that the arch-specific code is not using any helper
function to check for sigaltstack; instead it just uses
"if (ss_size)" checks.
Huh? I'm not sure I understand what you're talking about. It seems
reasonable to have the invariant that ss_size != 0 if and only if an
alt stack is enabled, and do_sigaltstack seems to enforce that
invariant.
But we have that (IMO quite silly) requirement that the
returned oss->ss_flags is consistent.
So if inside the signal handler I use SS_DISABLE and
the kernel translates this into "ss_size = 0", the next
call to sigaltstack() will return 0 in oss->ss_flags.
It should return SS_DISABLE, right?
Who knows?
Man page says:
SS_ONSTACK
The process is currently executing on the alternate signal
stack. (Note that it is not possible to change the alternate
signal stack if the process is currently executing on it.)

SS_DISABLE
The alternate signal stack is currently disabled.

Both apply, and, judging from the description of EPERM, I
can assume that SS_ONSTACK should be returned.
If we remove EPERM completely, then obviously SS_DISABLE
should be returned.

And it won't set SS_ONSTACK
because you're not in the alt stack because there is no alt stack.

Of course, there *was* an alt stack when the signal was delivered, and
you're on that stack.
Exactly.
Do you think this can be ignored?
The man page should then be corrected, with EPERM and the
above note removed, right?

So the patch will need to update
all arches... I wonder if maybe someone can fix that
problem and update the arch-specific code. If not,
I'll probably need to update only the x86-specific code
and add an arch-specific define, which is a bit nasty.
Just change do_sigaltstack?
But if it's that easy and we do not even need a consistent
oss->ss_flags - why not remove the EPERM check entirely,
rather than only for SS_DISABLE? Note that if it is removed
only for SS_DISABLE and yet SS_DISABLE is translated to
"ss_size=0", then by the next sigaltstack() call you can do
whatever you want: the EPERM check will be entirely bypassed.
So if you are fine with even this, I can send the patch to
completely remove the check. Much easier for me. :)
I think the semantics of oss->ss_size are quite bad, but they are
already documented, so I am not sure.
I would send a patch to remove the check or a patch to add a new
SS_FORCE that disables the check. It should be just a couple of lines
of code. A selftests patch along with it would help. Cc linux-abi on
all of it.
Hmm, OK. But this can potentially contradict the man page,
and I fear it can be rejected.
So don't be surprised if I add your Acked-by, warning-warning. :)

BTW, the sigcontext SS stuff is queued for -next. I doubt it'll make
4.5 since I think that all the relevant maintainers are just
recovering from vacations, and I already have a decent backlog of
stuff that hasn't landed in -tip yet.
Thanks for taking care of that!