Re: [PATCH v2 next 06/11] tools/nolibc/printf: Use bit-masks to hold requested flag, length and conversion chars

From: Thomas Weißschuh

Date: Wed Feb 18 2026 - 12:36:40 EST

On 2026-02-16 22:47:36+0000, David Laight wrote:
> On Mon, 16 Feb 2026 20:52:49 +0100
> Thomas Weißschuh <linux@xxxxxxxxxxxxxx> wrote:
>
> > On 2026-02-06 19:11:16+0000, david.laight.linux@xxxxxxxxx wrote:
> > > From: David Laight <david.laight.linux@xxxxxxxxx>
> > >
> > > Use flags bits (1u << (ch & 31)) for the flags, length modifiers, and
> > > conversion specifiers.
> > > This makes it easy to test for multiple values at once.
> > >
> > > Detect the conversion flags " #+-0" although they are currently all ignored.
> > >
> > > Add support for length modifiers 't' and 'z' (both long) and 'q' and 'L'
> > > (both long long).
> > >
> > > Add support for "%i" (the same as %d").
> >
> > Would it be much work to split the new functionality into its own patch?
> > >
> > > Unconditionally generate the signed values (for %d) to remove a second
> > > set of checks for the size.
> > >
> > > Signed-off-by: David Laight <david.laight.linux@xxxxxxxxx>
> > > ---
> > >
> > > Changes for v2:
> > > - Use #defines to make the code a lot more readable.
> > > - Include the changes from the old patch 10 that used masks for the
> > > conversion specifiers.
> > > - Detect all the valid flag characters even though they are not implemented.
> > > - Support for left justifying field is moved to patch 7.
> > >
> > > tools/include/nolibc/stdio.h | 151 ++++++++++++++++++++++++-----------
> > > 1 file changed, 103 insertions(+), 48 deletions(-)
> > >
> > > diff --git a/tools/include/nolibc/stdio.h b/tools/include/nolibc/stdio.h
> > > index bb54f488c228..b14cf8224403 100644
> > > --- a/tools/include/nolibc/stdio.h
> > > +++ b/tools/include/nolibc/stdio.h
> > > @@ -240,19 +240,44 @@ char *fgets(char *s, int size, FILE *stream)
> > > }
> > >
> > >
> > > -/* minimal printf(). It supports the following formats:
> > > - * - %[l*]{d,u,c,x,p}
> > > - * - %s
> > > - * - unknown modifiers are ignored.
> > > +/* simple printf(). It supports the following formats:
> > > + * - %[-][width][{l,t,z,ll,L,j,q}]{d,u,c,x,p,s,m,%}
> > > + * - %%
> > > + * - invalid formats are copied to the output buffer
> > > */
> > > +
> > > +/* This code uses 'flag' variables that are indexed by the low 6 bits
> > > + * of characters to optimise checks for multiple characters.
> > > + *
> > > + * _NOLIBC_PF_FLAGS_CONTAIN(flags, 'a', 'b'. ...)
> > > + * returns non-zero if the bit for any of the specified characters is set.
> > > + *
> > > + * _NOLIBC_PF_CHAR_IS_ONE_OF(ch, 'a', 'b'. ...)
> > > + * returns the flag bit for ch if it is one of the specified characters.
> > > + * All the characters must be in the same 32 character block (non-alphabetic,
> > > + * upper case, or lower case) of the ASCII character set.)
> > > + */
> > > +#define _NOLIBC_PF_FLAG(ch) (1u << ((ch) & 0x1f))
> > > +#define _NOLIBC_PF_FLAG_NZ(ch) ((ch) ? _NOLIBC_PF_FLAG(ch) : 0)
> > > +#define _NOLIBC_PF_FLAG8(cmp_1, cmp_2, cmp_3, cmp_4, cmp_5, cmp_6, cmp_7, cmp_8, ...) \
> > > + (_NOLIBC_PF_FLAG_NZ(cmp_1) | _NOLIBC_PF_FLAG_NZ(cmp_2) | \
> > > + _NOLIBC_PF_FLAG_NZ(cmp_3) | _NOLIBC_PF_FLAG_NZ(cmp_4) | \
> > > + _NOLIBC_PF_FLAG_NZ(cmp_5) | _NOLIBC_PF_FLAG_NZ(cmp_6) | \
> > > + _NOLIBC_PF_FLAG_NZ(cmp_7) | _NOLIBC_PF_FLAG_NZ(cmp_8))
> > > +#define _NOLIBC_PF_FLAGS_CONTAIN(flags, ...) \
> > > + ((flags) & _NOLIBC_PF_FLAG8(__VA_ARGS__, 0, 0, 0, 0, 0, 0, 0))
> > > +#define _NOLIBC_PF_CHAR_IS_ONE_OF(ch, cmp_1, ...) \
> > > + (ch < (cmp_1 & ~0x1f) || ch > (cmp_1 | 0x1f) ? 0 : \
> > > + _NOLIBC_PF_FLAGS_CONTAIN(_NOLIBC_PF_FLAG(ch), cmp_1, __VA_ARGS__))
> >
> > With signed chars I get:
> >
> > sysroot/i386/include/stdio.h:321:37: error: comparison is always false due to limited range of data type [-Werror=type-limits]
> > 321 | (ch < (cmp_1 & ~0x1f) || ch > (cmp_1 | 0x1f) ? 0 : \
>
> Stupid type-limits warning.
> It is almost impossible to get rid of without making the code unreadable.
>
> > | ^
> > sysroot/i386/include/stdio.h:389:35: note: in expansion of macro '_NOLIBC_PF_CHAR_IS_ONE_OF'
> > 389 | ch_flag = _NOLIBC_PF_CHAR_IS_ONE_OF(ch, 'c', 'd', 'i', 'u', 'x', 'p');
> > |
> >
> > This can be fixed by switching 'ch' to be always unsigned.
>
> That's likely to provoke another error elsewhere.
> In any case optimising that test away makes the code smaller!

We will need to get rid of this warning in some way.
Using pragmas in nolibc proper is something we don't want to do.

> > You can run the the builtin test suite like this:
> > -p triggers the download of the toolchains
> > -l uses LLVM/clang instead of the downloaded toolchain
> >
> > $ cd tools/testing/selftests/nolibc
> > $ ./run-tests.sh -m user -p
> > $ ./run-tests.sh -m user -l
>
> I've just been running:
> rm nolibc-test; make -O /path/.../dir && ./nolibc-test printf

This will miss weird things going on between different architectures.
Probably not directly relevant for changes to printf, but still useful.
And it shows the type limits warning.

> > > +
> > > typedef int (*__nolibc_printf_cb)(void *state, const char *buf, size_t size);
> > >
> > > static __attribute__((unused, format(printf, 3, 0)))
> > > int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list args)
> > > {
> > > - char lpref, ch;
> > > - unsigned long long v;
> > > + char ch;
> > > unsigned int written, width;
> > > + unsigned int flags, ch_flag;
> > > size_t len;
> > > char tmpbuf[21];
> > > const char *outstr;
> > > @@ -265,6 +290,7 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
> > > break;
> > >
> > > width = 0;
> > > + flags = 0;
> > > if (ch != '%') {
> > > while (*fmt && *fmt != '%')
> > > fmt++;
> > > @@ -274,6 +300,14 @@ int __nolibc_printf(__nolibc_printf_cb cb, void *state, const char *fmt, va_list
> > >
> > > ch = *fmt++;
> > >
> > > + /* Conversion flag characters */
> > > + for (;; ch = *fmt++) {
> > > + ch_flag = _NOLIBC_PF_CHAR_IS_ONE_OF(ch, ' ', '#', '+', '-', '0');
> > > + if (!ch_flag)
> > > + break;
> > > + flags |= ch_flag;
> > > + }
> >
> > What is the advantage of this over:
> >
> > while (1) {
> > /* ... */
> >
> > ch = *fmt++;
> > }
>
> One line shorter :-)

Meh. Let's not optimize for that.

> > Or combine it with the 'ch = *fmt++' from above and do:
> >
> > while (1) {
> > ch = *fmt++;
> >
> > /* ... */
> > }
> >
> > These look much simpler to me.
>
> The code always needs to get a new character after using one.
> So you come out of each bit of code with the 'next char to check' in ch.
> Which means you need a new character at the end of the loop for the next
> iteration.

Fair enough, but let's use some clearer logic.

> > > +
> > > /* width */
> > > while (ch >= '0' && ch <= '9') {
> > > width *= 10;
> >
> > (...)
> >
> > > +do_output:
> > > written += len;
> > >
> > > + /* An OPTIMIZER_HIDE_VAR() seems to stop gcc back-merging this
> > > + * code into one of the conditionals above.
> > > + */
> > > + __asm__ volatile("" : "=r"(len) : "0"(len));

(...)

> > Why separate input and ouput arguments instead of one combined one ('+r')?
> > I have been wondering the same about the kernel definition, too.
>
> I'm not sure either.
> Certainly "+r" is more modern - which is why it isn't used in a lot of places.
> They may be identical (indeed "+r" might get converted to the "0" form),
> but maybe it gives better separation - I just copied the same version.

Okay. We use '+r' in the syscall wrappers, so availability should not be
an issue. But I don't really care on way or another.

Thomas