Re: siginfo_t ABI break on sparc64 from si_addr_lsb move 3y ago

From: Eric W. Biederman
Date: Fri Apr 30 2021 - 13:08:36 EST


Arnd Bergmann <arnd@xxxxxxxx> writes:

> On Thu, Apr 29, 2021 at 7:23 PM Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
>
>> > Which option do you prefer? Are there better options?
>>
>> Personally the most important thing to have is a single definition
>> shared by all architectures so that we consolidate testing.
>>
>> A little piece of me cries a little whenever I see how badly we
>> implemented the POSIX design. As specified by POSIX the fields can be
>> place in siginfo such that 32bit and 64bit share a common definition.
>> Unfortunately we did not addpadding after si_addr on 32bit to
>> accommodate a 64bit si_addr.
>>
>> I find it unfortunate that we are adding yet another definition that
>> requires translation between 32bit and 64bit, but I am glad
>> that at least the translation is not architecture specific. That common
>> definition is what has allowed this potential issue to be caught
>> and that makes me very happy to see.
>>
>> Let's go with Option 3.
>>
>> Confirm BUS_MCEERR_AR, BUS_MCEERR_AO, SEGV_BNDERR, SEGV_PKUERR are not
>> in use on any architecture that defines __ARCH_SI_TRAPNO, and then fixup
>> the userspace definitions of these fields.
>>
>> To the kernel I would add some BUILD_BUG_ON's to whatever the best
>> maintained architecture (sparc64?) that implements __ARCH_SI_TRAPNO just
>> to confirm we don't create future regressions by accident.
>>
>> I did a quick search and the architectures that define __ARCH_SI_TRAPNO
>> are sparc, mips, and alpha. All have 64bit implementations.
>
> I think you (slightly) misread: mips has "#undef __ARCH_SI_TRAPNO", not
> "#define __ARCH_SI_TRAPNO". This means it's only sparc and
> alpha.
>
> I can see that the alpha instance was added to the kernel during linux-2.5,
> but never made it into the glibc or uclibc copy of the struct definition, and
> musl doesn't support alpha or sparc. Debian codesearch only turns up
> sparc (and BSD) references to si_trapno.



>> I did a quick search and the architectures that define __ARCH_SI_TRAPNO
>> are sparc, mips, and alpha. All have 64bit implementations. A further
>> quick search shows that none of those architectures have faults that
>> use BUS_MCEERR_AR, BUS_MCEERR_AO, SEGV_BNDERR, SEGV_PKUERR, nor do
>> they appear to use mm/memory-failure.c
>>
>> So it doesn't look like we have an ABI regression to fix.
>
> Even better!
>
> So if sparc is the only user of _trapno and it uses none of the later
> fields in _sigfault, I wonder if we could take even more liberty at
> trying to have a slightly saner definition. Can you think of anything that
> might break if we put _trapno inside of the union along with _perf
> and _addr_lsb?

On sparc si_trapno is only set when SIGILL ILL_TRP is set. So we can
limit si_trapno to that combination, and it should not be a problem for
a new signal/si_code pair to use that storage. Precisely because it is
new.

Similarly on alpha si_trapno is only set for:

SIGFPE {FPE_INTOVF, FPE_INTDIV, FPE_FLTOVF, FPE_FLTDIV, FPE_FLTUND,
FPE_FLTINV, FPE_FLTRES, FPE_FLTUNK} and SIGTRAP {TRAP_UNK}.

Placing si_trapno into the union would also make the problem that the
union is pointer aligned a non-problem as then the union immediate
follows a pointer.

I hadn't had a chance to look before but we must deal with this. The
definition of perf_sigtrap in 42dec9a936e7696bea1f27d3c5a0068cd9aa95fd
is broken on sparc, alpha, and ia64 as it bypasses the code in
kernel/signal.c that ensures the si_trapno or the ia64 special fields
are set.

Not to mention that perf_sigtrap appears to abuse si_errno.

The code is only safe if the analysis that says we can move si_trapno
and perhaps the ia64 fields into the union is correct. It looks like
ia64 much more actively uses it's signal extension fields including for
SIGTRAP, so I am not at all certain the generic definition of
perf_sigtrap is safe on ia64.

> I suppose in theory sparc64 or alpha might start using the other
> fields in the future, and an application might be compiled against
> mismatched headers, but that is unlikely and is already broken
> with the current headers.

If we localize the use of si_trapno to just a few special cases on alpha
and sparc I think we don't even need to worry about breaking userspace
on any architecture. It will complicate siginfo_layout, but it is a
complication that reflects reality.

I don't have a clue how any of this affects ia64. Does perf work on
ia64? Does perf work on sparc, and alpha?

If perf works on ia64 we need to take a hard look at what is going on
there as well.

Eric