Re: sparc/ppc/arm compat siginfo ABI regressions: sending SIGFPE via kill() returns wrong values in si_pid and si_uid

From: Dave Martin
Date: Fri Apr 13 2018 - 14:56:36 EST


On Fri, Apr 13, 2018 at 07:50:17PM +0100, Russell King - ARM Linux wrote:
> On Fri, Apr 13, 2018 at 07:35:38PM +0100, Dave Martin wrote:
> > If that's the case though, I don't see how a userspace testsuite is
> > hitting this code path. Maybe I've misunderstood the context of this
> > thread.
>
> It isn't hitting this exact case.
>
> The userspace testsuite is hitting an entirely different case:
>
> kill(getpid(), SIGFPE);
>
> As one expects, this generates a SIGFPE to the current process, which
> then inspects the siginfo structure. Being a userspace generated
> signal, si_code is set to SI_USER, which has the value 0.
>
> With FPE_FIXME defined to zero, as Eric has done:
>
> enum siginfo_layout siginfo_layout(int sig, int si_code)
> {
> 	enum siginfo_layout layout = SIL_KILL;
> 	if ((si_code > SI_USER) && (si_code < SI_KERNEL)) {
> 		...
> 	} else {
> 		...
> #ifdef FPE_FIXME
> 		if ((sig == SIGFPE) && (si_code == FPE_FIXME))
> 			layout = SIL_FAULT;
> #endif
> 	}
> 	return layout;
> }
>
> This causes siginfo_layout() to return SIL_FAULT for this userspace
> generated signal, rather than the correct SIL_KILL.
>
> This affects which fields we copy to userspace.
>
> For SI_USER, the kernel is expected to pass si_pid and si_uid to the
> userspace process. On ARM these are the first two consecutive 32-bit
> quantities in the union, and both are copied when siginfo_layout()
> returns SIL_KILL. However, when SIL_FAULT is returned, we only copy
> si_addr in the union, which on ARM is just one 32-bit quantity.
>
> Consequently, userspace sees a correct value for si_pid, and si_uid
> remains set to whatever was there in userspace. In the case of the
> strace program, that's zero. This means if you run the strace
> testsuite as root, the problem doesn't appear, but if you run it as
> a non-root user, it will.
>
> So, the testsuite case has little to do with the behaviour of the VFP
> handling - it's to do with the behaviour of the kernel's signal handling.
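
For reference, the userspace case described above can be approximated
by a small standalone check. This is only a minimal sketch, not the
strace test itself: the function name check_sigfpe_siginfo() and the
globals are made up for illustration. It sends SIGFPE to itself with
kill() and compares si_code, si_pid and si_uid against SI_USER,
getpid() and getuid().

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Values captured by the handler (hypothetical names, illustration only). */
static volatile sig_atomic_t handled;
static volatile long seen_code = -1;
static volatile long seen_pid = -1;
static volatile long seen_uid = -1;

static void on_sigfpe(int sig, siginfo_t *info, void *ucontext)
{
	(void)sig;
	(void)ucontext;
	seen_code = info->si_code;
	seen_pid = (long)info->si_pid;
	seen_uid = (long)info->si_uid;
	handled = 1;
}

/*
 * Returns 0 if the siginfo fields match what SI_USER promises,
 * 1 on a mismatch, and -1 if the signal could not be set up or sent.
 */
static int check_sigfpe_siginfo(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = on_sigfpe;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);

	if (sigaction(SIGFPE, &sa, NULL) != 0)
		return -1;

	/* Userspace-generated SIGFPE: not a real arithmetic fault. */
	if (kill(getpid(), SIGFPE) != 0)
		return -1;
	if (!handled)
		return -1;

	printf("si_code=%ld si_pid=%ld si_uid=%ld\n",
	       seen_code, seen_pid, seen_uid);

	return (seen_code == SI_USER &&
		seen_pid == (long)getpid() &&
		seen_uid == (long)getuid()) ? 0 : 1;
}
```

Calling check_sigfpe_siginfo() from main() as a non-root user is the
interesting case: per the explanation above, a stale zero in si_uid
would be indistinguishable from the right answer when running as root.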

Oh, right. So, going back to the unhandled VFP bounce question,
is it reasonable for that to be a SIGKILL? That avoids the question
of what si_code userspace should see, because userspace doesn't get
to see siginfo at all in that case: it's dead.

Or do we hit this in real situations that we want userspace to bail out
of more gracefully?

Cheers
---Dave