Re: [RFC PATCH v2 2/8] tools/nolibc: Remove .global _start from the entry point code

From: Willy Tarreau
Date: Tue Mar 22 2022 - 13:58:35 EST


On Tue, Mar 22, 2022 at 10:30:53AM -0700, Nick Desaulniers wrote:
> (Moving folks to bcc; check the lists if you're interested)

Yes, agreed :-)

> On Tue, Mar 22, 2022 at 10:25 AM Willy Tarreau <w@xxxxxx> wrote:
> > The purpose is clearly *not* to implement a libc, but to have
> > something very lightweight that allows to compile trivial programs. A good
> > example of this is tools/testing/selftests/rcutorture/bin/mkinitrd.sh. I'm
> > personally using a tiny pre-init shell that I always package with my
> > kernels and that builds with them [1]. It will never do big things but
> > the balance between ease of use and coding effort is pretty good in my
> > experience. And I'm also careful not to make it complicated to use nor
> > to maintain, pragmatism is important and the effort should remain on the
> > program developer if some arbitration is needed.
>
> Neat, I bet that helps generate very small initrd! Got any quick size
> measurements?

Yep:

First, the usual static printf("hello world!\n"):

$ ll hello-*libc
-rwxrwxr-x 1 willy dev 719232 Mar 22 18:50 hello-glibc*
-rwxrwxr-x 1 willy dev   1248 Mar 22 18:51 hello-nolibc*

$ objdump -h hello-nolibc
hello-nolibc:     file format elf64-x86-64

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
  0 .text         00000300  00000000004000b0  00000000004000b0  000000b0  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .rodata       00000015  00000000004003b0  00000000004003b0  000003b0  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA

Then the preinit stuff:

$ ll initramfs/init
-rwxr-xr-x 1 willy users 13936 Mar 22 18:40 initramfs/init*

$ xz -c9 < initramfs/init | wc -c
8392

$ size initramfs/init
   text    data     bss     dec     hex filename
  13348       0   23016   36364    8e0c init

$ objdump -h initramfs/init
initramfs/init:     file format elf64-x86-64
Sections:
Idx Name          Size      VMA               LMA               File off  Algn
  0 .text         00002b74  00000000004000e8  00000000004000e8  000000e8  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .rodata       000008b0  0000000000402c60  0000000000402c60  00002c60  2**5
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .bss          000059e8  0000000000404520  0000000000404520  00003520  2**5
                  ALLOC
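
The size columns can be cross-checked against the section sizes that
objdump reports. A minimal sketch, assuming the Berkeley-style "text"
column counts .rodata together with .text; the constants are copied from
the listings above and the function names are only illustrative:

```c
/* Section sizes copied from "objdump -h initramfs/init" above. */
enum {
    INIT_TEXT_SZ   = 0x2b74,  /* .text   */
    INIT_RODATA_SZ = 0x08b0,  /* .rodata */
    INIT_DATA_SZ   = 0,       /* no .data section in the listing */
    INIT_BSS_SZ    = 0x59e8   /* .bss    */
};

/* "text" column: all read-only allocated sections (assumption above). */
unsigned long size_text(void)
{
    return INIT_TEXT_SZ + INIT_RODATA_SZ;
}

/* "bss" column: zero-filled memory, reserved at load time. */
unsigned long size_bss(void)
{
    return INIT_BSS_SZ;
}

/* "dec" column: text + data + bss. */
unsigned long size_dec(void)
{
    return size_text() + INIT_DATA_SZ + size_bss();
}
```

With these values text comes out as 13348, bss as 23016 and dec as 36364
(0x8e0c), which matches the size output. Since .bss is ALLOC only (no
CONTENTS), it adds nothing to the 13936 bytes on disk.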

This one supports ~30-40 simple commands (mount/unmount, mknod, ls, ln),
a tar extractor, multi-level braces, boolean expression evaluation,
variable expansion, and a config file parser to script all this. The code
is 20 years old and is really ugly (even uglier than you think), but that
gives an idea. 20 years ago the init was much simpler and only 800 bytes
(my constraint was single floppies containing kernel+rootfs); strings
were manually merged by their common tails and put in .text to drop .rodata.
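
Merging strings by their tails can be pictured as follows. This is only
a sketch of one way to read it, not the actual init code: when one string
is the suffix of another, the shorter one can point into the longer one,
so the shared bytes are stored once. All names below are hypothetical.

```c
/* Hypothetical example: "mount" is the tail of "umount", so a single
 * array can serve both names. */
static const char merged_names[] = "umount";

/* Full string: "umount". */
const char *name_umount(void)
{
    return merged_names;
}

/* Same array, skipping the leading 'u': "mount". */
const char *name_mount(void)
{
    return merged_names + 1;
}
```

Here 7 bytes are stored instead of 13 (both strings with their trailing
NUL). Placing the array in .text rather than .rodata would additionally
need a toolchain-specific section attribute, which is not shown.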

You'll also note that the data segment above is 0 bytes. That used to be
a convenient way to shrink programs further, but these days, given how
linkers arrange segments by permissions, it doesn't save as much as it
used to. It's likely that at some point I'll assume that there must be some
variables by default (errno, environ, etc.) and that we'll accept spending
a few extra tens of bytes by default for more convenience.
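
For reference, a rough sketch of which C objects end up in which section,
assuming typical GCC/Clang defaults on ELF targets (exact placement
depends on the toolchain and flags). All names are hypothetical.

```c
/* Read-only object: typically placed in .rodata, so no .data is needed. */
const char banner[] = "init";

/* No initializer, zeroed at load time: typically .bss, no file space. */
static char scratch[4096];

/* Writable object with a non-zero initializer: this is what creates .data. */
static int next_id = 1;

/* Accessors so the objects above are actually referenced. */
int alloc_id(void)
{
    return next_id++;
}

char *scratch_buf(void)
{
    return scratch;
}
```

A program that only uses the first two kinds of objects can keep its data
column at 0, as in the size output above.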

Cheers,
Willy