Re: [PATCH 4/4] sigaltstack: allow disabling and re-enabling sas within sighandler

From: Stas Sergeev
Date: Sun Jan 31 2016 - 14:52:22 EST


31.01.2016 22:03, Andy Lutomirski wrote:
On Sun, Jan 31, 2016 at 9:33 AM, Stas Sergeev <stsp@xxxxxxx> wrote:
31.01.2016 20:00, Andy Lutomirski wrote:
On Sun, Jan 31, 2016 at 8:28 AM, Stas Sergeev <stsp@xxxxxxx> wrote:
Linux implements sigaltstack() in a way that makes it impossible to
use with swapcontext(). Per the man page, sigaltstack() is allowed to
return EPERM if the process is altering its sigaltstack while running
on the sigaltstack.
This is likely needed to consistently return oss->ss_flags, which
indicates whether the process is currently running on the sigaltstack
or not.
Unfortunately, linux takes that permission to return EPERM too literally:
it returns EPERM even if you don't want to change to another sigaltstack,
but only want to temporarily disable sigaltstack with SS_DISABLE.
You can't use swapcontext() without disabling sigaltstack first, or the
stack will be re-used and overwritten by a subsequent signal.

With this patch, disabling sigaltstack inside a signal handler becomes
possible, and swapcontext() can then be used safely. After switching
back to the sighandler, the app can re-enable the sigaltstack.
The oss->ss_flags will correctly indicate the current use of sigaltstack,
even if it is temporarily disabled. Any attempt to modify the sigaltstack
(rather than to disable or re-enable it) within the sighandler will
still be punished with EPERM, as suggested by POSIX.
This seems considerably more complicated than my previous proposal to
add an SS_FORCE flag to say "I know what I'm doing. Ignore POSIX and
let me change the sigaltstack configuration even if it's in use".
What's the advantage?
To me, the main advantage is to stay POSIX-compatible, rather
than adding the linux-specific flag. Please note that this flag does
not add any value other than to say "I want to ignore POSIX here"
in your interpretation of POSIX, and in my interpretation it doesn't
say even this, because in my interpretation the temporary disabling
is not prohibited.
POSIX says nothing about temporary anything. It says:

SS_ONSTACK The process is currently executing on the alternate signal
stack. Attempts to modify the alternate signal stack while
the process is executing on it fail. This flag shall not be
modified by processes.

It's a bit ambiguous because "Attempts to modify the alternate signal
stack while the process is executing on it fail." is under SS_ONSTACK,
so it's not quite clear whether SS_DISABLE is *also* supposed to fail
if on the stack.
I think you are quoting the description of the "oss" struct,
i.e. the one that is returned. It correctly says that when
SS_ONSTACK is _returned_, you can't modify the alternate
stack. So you are likely applying the above quote to the value
that is being set, not the one that is returned.
The ambiguity I am pointing to is different. The above says you
can't _modify_ the sas. "Modify" likely means a modification
after which SS_ONSTACK will no longer be returned, i.e. setting
it to a different location. And I am not modifying it, and
the proof is that SS_ONSTACK is still returned as before.

So if it doesn't even fit my interpretation, how would
I write a man description for it? I'll have to first clarify the vague
wording to clearly sound your way, and then add the flag to override
this. This whole procedure looks very illogical to me. So to find out
if it is just me, I'd like to hear from anyone else supporting the idea
of adding this flag. If people think its existence is justified, then fine.
But to me this flag is non-portable, while both sigaltstack() and
swapcontext() are portable. So what will I gain by adding a
non-portable flag to my apps? A bunch of ifdefs?
IMHO as long as both swapcontext() and sigaltstack() are POSIX-compatible,
they should be compatible with each other in a POSIX-compatible
way. If POSIX needs linux-specific flags to make them compatible
with each other, then POSIX is inconsistent. So let's just not interpret
it in a way that makes it so.
What do other operating systems do here? You might be stuck with
Linux-specific code here no matter what. If you're causing Linux to
match FreeBSD, that's a different story.
Likely not.
But the most intuitive way for the programmer is to just
use SS_DISABLE so I wonder if some OSes allow that.
FreeBSD does not seem to be one of them, though:
http://www.hpdc.syr.edu/~chapin/cis657/FreeBSD_5.2.1_Doxygen/kern__sig_8c-source.html
So I don't know how they solve the swapcontext() problem.
Perhaps I should ask them.

So in short:
Your concern is the patch complexity. Doing things your way will,
however, move the problem to the user: he will have to deal with
linux-specific flags and add ifdefs merely to use POSIX-compatible
interfaces.

There can also be subtle technical differences.
With your approach a nested signal can AFAIU overflow the
disabled sigaltstack because you don't maintain oss->ss_flags
in a consistent way. There is overflow protection code:
---
	/*
	 * If we are on the alternate signal stack and would overflow it, don't.
	 * Return an always-bogus address instead so we will die with SIGSEGV.
	 */
	if (onsigstack && !likely(on_sig_stack(sp)))
		return (void __user *)-1L;
---
In your approach it will be bypassed.
And it's not possible for an app to find out whether it is currently
running on the sigaltstack once it is disabled.
An app can figure out if it's on the altstack the same way the kernel
does.
With inline asm or a register variable, which is not good for portability.
And only if it knows, at that particular place, where the
alt stack was set before.
And if it asks the kernel via sigaltstack(NULL, &oss), then with your
approach the kernel's reply will contradict what the app sees.

In fact, the app needs to be quite careful with this temporary
disable thing. If you temporarily disable sigaltstack, then
swapcontext, then you need to keep track of exactly when you're
supposed to re-enable it, which involves knowing what's going on with
the stacks.
My test-case code simply does this in sighandler:

sigaltstack(SS_DISABLE);
swapcontext();
sigaltstack(SS_ENABLE);

which means that as long as only the sighandler context
manages the sigaltstack, you are safe. Non-signal
context should never care.

Also, consider a use case like yours but with *two* contexts that use
their own altstack. If you go to context A, enable sigaltstack, get a
signal, temporarily disable, then swapcontext to B, which tries to
re-enable its own sigaltstack, then everything gets confusing with
your patch, because, with your patch, the kernel is only tracking one
temporarily disabled sigaltstack.
Of course the good practice is to set the sigaltstack
before creating the contexts. Then the above scenario
would have to involve switching between 2 signal handlers to get
into trouble. I think the scenario with switching between
2 signal handlers is very, very unrealistic.

So that's another argument in favor of my thought that there should
just be a way to override the permission checks to turn sigaltstack
all the way off or to reprogram it even if you're running on the
altstack.
Ok so if you block this approach, I wonder how many people
will block the SS_FORCE approach, esp given that the approach
without any new flags was already released and seen.
Of course if no one else cares, I'll have to do that sooner or later.
Btw, you can do that too; you can even re-use my test case to
save time. :)