Re: rcutorture initrd/nolibc build on ARMv8?

From: Mark Rutland
Date: Wed Jan 20 2021 - 07:55:19 EST


On Tue, Jan 19, 2021 at 06:43:58PM +0100, Willy Tarreau wrote:
> On Tue, Jan 19, 2021 at 06:16:37PM +0100, Willy Tarreau wrote:
> Given that you used a native compiler we can't suspect an issue with a
> bare-metal compiler possibly affecting how kernel headers are passed
> there. But nevertheless, I'd still not disregard the possibility that
> the headers found under "linux/" are picked from the libc which for
> whatever reason would be missing a lot of them.

I think the actual issue here is a misapprehension in nolibc.h, which
started blowing up due to a refactoring in asm/unistd.h.

In nolibc.h, we do:

| /* Some archs (at least aarch64) don't expose the regular syscalls anymore by
| * default, either because they have an "_at" replacement, or because there are
| * more modern alternatives. For now we'd rather still use them.
| */
| #define __ARCH_WANT_SYSCALL_NO_AT
| #define __ARCH_WANT_SYSCALL_NO_FLAGS
| #define __ARCH_WANT_SYSCALL_DEPRECATED

... but this isn't quite right -- it's not that the syscalls aren't
exposed by default, but rather that these syscall numbers are not valid
for architectures which do not define the corresponding __ARCH_WANT_*
flags. Architectures without those have never implemented the syscalls,
and would have returned -ENOSYS for the unrecognized syscall numbers,
but the numbers could be allocated to (distinct) syscalls in future.

Since commit:

a0673fdbcd421052 ("asm-generic: clean up asm/unistd.h")

... those definitions got pulled out of <asm-generic/unistd.h>, and
hence userspace can no longer accidentally get them by defining
__ARCH_WANT_* on an architecture where the syscalls don't exist
(e.g. arm64).

It seems that the headers on my Debian 10.7 system were generated after
that commit, whereas yours were generated before that.

> We've seen that __NR_fork or __NR_dup2 for example were missing in your
> output, on my native machine I can see them, so that could give us a clue
> about the root cause of the issue:
>
> $ gcc -fno-asynchronous-unwind-tables -fno-ident -nostdlib -include nolibc.h -lgcc -s -static -E -dM init-fail.c | egrep '__NR_(fork|dup2)'
> #define __NR_dup2 1041
> #define __NR_syscalls (__NR_fork+1)
> #define __NR_fork 1079

As above, these are bogus for arm64. There is no syscall number for dup2
or fork, and __NR_syscalls is currently only 442.

I think the right thing to do is to have nolibc.h detect which syscalls
are implemented, and to not define __ARCH_WANT_*.
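
Something along these lines, perhaps. This is only a rough sketch of the
detection idea, not the actual nolibc code: sketch_fork() and sketch_dup2()
are hypothetical names, and it goes through libc's syscall() purely so that
it can be built and run standalone, where nolibc would use its own syscall
wrappers. It assumes that clone(SIGCHLD) and dup3(..., 0) are acceptable
stand-ins wherever the legacy numbers are absent.

```c
#define _GNU_SOURCE
#include <signal.h>       /* SIGCHLD */
#include <sys/syscall.h>  /* __NR_* as provided by the installed kernel headers */
#include <unistd.h>       /* syscall() */

/*
 * Hypothetical helper: use fork where the architecture really has it,
 * otherwise fall back to clone. No __ARCH_WANT_* is defined here, so
 * __NR_fork is only visible if the headers genuinely provide it.
 */
static inline long sketch_fork(void)
{
#if defined(__NR_fork)
	return syscall(__NR_fork);
#elif defined(__NR_clone)
	/* clone with only SIGCHLD set and no new stack behaves like fork */
	return syscall(__NR_clone, SIGCHLD, 0, 0, 0, 0);
#else
#error "neither __NR_fork nor __NR_clone is available"
#endif
}

/* Hypothetical helper: same idea for dup2, falling back to dup3. */
static inline long sketch_dup2(int oldfd, int newfd)
{
#if defined(__NR_dup2)
	return syscall(__NR_dup2, oldfd, newfd);
#elif defined(__NR_dup3)
	/* dup3 with flags == 0 matches dup2, except it fails if oldfd == newfd */
	return syscall(__NR_dup3, oldfd, newfd, 0);
#else
#error "neither __NR_dup2 nor __NR_dup3 is available"
#endif
}
```

With headers generated after the commit above, an arm64 build of this
would take the clone and dup3 branches, since __NR_fork and __NR_dup2 are
simply not defined there.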

Thanks,
Mark.