Re: [RFC] TIF_NOTIFY_RESUME, arch/*/*/*signal*.c and all such

From: Al Viro
Date: Wed Apr 25 2012 - 12:10:12 EST


On Wed, Apr 25, 2012 at 05:46:11PM +0200, Oleg Nesterov wrote:
> OK, I didn't really try to think, and somehow I simply can't wake up today.
> But perhaps we can do something the following? We add the new syscall
>
> sys_eintr(void)
> {
> return -EINTR;
> }
>
> (perhaps not strictly needed, perhaps we can reuse sys_restart_syscal)
>
> Now,
>
> --- x/arch/x86/kernel/signal.c
> +++ x/arch/x86/kernel/signal.c
> @@ -711,6 +711,13 @@ handle_signal(unsigned long sig, siginfo
> regs->ax = regs->orig_ax;
> regs->ip -= 2;
> break;
> +
> + case -EINTR:
> + break;
> +
> + default:
> + if (regs->orig_ax == NR_eintr)
> + regs->ax = NR_eintr;
> }
> }
>
> @@ -791,6 +798,7 @@ static void do_signal(struct pt_regs *re
> case -ERESTARTSYS:
> case -ERESTARTNOINTR:
> regs->ax = regs->orig_ax;
> + regs->orig_ax = NR_eintr;
> regs->ip -= 2;
> break;
>
>
> this ignores ERESTART_RESTARTBLOCK for simplicity.

Ehh... You do realize that it's that simple only on architectures that
have syscall number in a register? arm/parisc/unicore32 have really
painful code dealing with restart_syscall(2); building trampoline on
stack and all such... Take a look at e.g. insert_restart_trampoline()
in arch/parisc/kernel/signal.c (BTW, it doesn't check for put_user()
failures while building that thing).

FWIW, I wonder if e.g. arm would be better off with the following
trick: new flag added to _TIF_SYSCALL_WORK, with __sys_trace checking
it and if it's set, clearing it and simulating sys_restart_syscall().
ERESTART_RESTARTBLOCK would become practically identical to ERESTARTNOHAND,
except that if we decide to restart it we would set the flag. All the
trampoline crap goes away that way and restart logics gets simpler on
any platform doing that kind of thing.

I'm not sure that SA_RESTART case is actually worth bothering with -
AFAICS, your mechanism won't cover SA_RESTART/SA_RESTART/!SA_RESTART
combinations anyway, and I'm not at all sure that we _want_ to change
user-visible behaviour. Basically, you have
syscall interrupted
SA_RESTART signal arrives, handler1 entered
!SA_RESTART signal arrives, handler2 entered
handler2 returns
hander1 returns
syscall restarts
and to get behaviour independent of relative timing of those two signals
you'd have to replace the last one with "syscall fails with EINTR".
It's a user-visible change and I'm not at all sure that we won't break
userland code with it.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/