Re: [PATCH] vsprintf/doc: Document format flags including field width and precision

From: Rasmus Villemoes
Date: Mon May 22 2023 - 17:05:19 EST


On 22/05/2023 17.08, Petr Mladek wrote:
> The kernel implementation of vsprintf() tries to be as compatible with
> the user space variant as possible. Though it does not implement all
> features. On the other hand, it adds some special pointer printing
> modifiers.
>
> Most differences are described in Documentation/core-api/printk-formats.rst
> Add the missing documentation of the supported flag characters
> '#', '0', '-', ' ', '+' together with field width and precision modifiers.
>
> Suggested-by: Luca Weiss <luca.weiss@xxxxxxxxxxxxx>
> Signed-off-by: Petr Mladek <pmladek@xxxxxxxx>
> ---
> What about something like this, please?
>
> Documentation/core-api/printk-formats.rst | 69 +++++++++++++++++++++++
> 1 file changed, 69 insertions(+)
>
> diff --git a/Documentation/core-api/printk-formats.rst b/Documentation/core-api/printk-formats.rst
> index dfe7e75a71de..79655b319658 100644
> --- a/Documentation/core-api/printk-formats.rst
> +++ b/Documentation/core-api/printk-formats.rst
> @@ -8,6 +8,75 @@ How to get printk format specifiers right
> :Author: Andrew Murray <amurray@xxxxxxxxxxxxxx>
>
>
> +Flag characters
> +===============
> +
> +The character '%' might be followed by the following flags that modify
> +the output:
> +
> + - '#' - prepend '0', '0x', or 'OX for 'o', 'x', 'X' number conversions
> + - '0' - zero pad number conversions on the field boundary
> + - '-' - left adjust on the field boundary, blank pad on the right
> + - ' ' - prepend space on positive numbers
> + - '+' - prepend + for positive numbers when using signed formats

[I wonder if we have a single user of any of the latter two in the
entire tree.]

> +Examples::
> +
> + |%x| |1a|
> + |%#x| |0x1a|
> + |%d| |26|
> + |% d| | 26|
> + |%+d| |+26|
> +
> +
> +Field width
> +===========
> +
> +A field width may be defined when '%' is optionally followed by the above flag
> +characters and:
> +
> + - 'number' - the decimal number defines the field width
> + - '*' the field width is defined by an extra parameter
> +
> +Values are never truncated when the filed width is not big enough.

filed -> field (several places)

> +Spaces are used by default when a padding is needed.
> +
> +Examples::
> +
> + |%6d| | 26|
> + |%-6d| |26 |
> + |%06d| |000026|
> +
> + printk("Dynamic table: |%*d|%*s|\n", id_width, id, max_name_len, name);
> +
> +The filed width value might have special meaning for some pointer formats.
> +For example, it limits the size of the bitmap handled by %*pb format.

It should also be noted that a negative field width passed as a '*'
argument is interpreted as if the '-' flag were given, with the absolute
value then used as the field width.
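
A minimal sketch of that rule, using userspace snprintf() as a stand-in
on the assumption that the kernel's vsnprintf() behaves the same way
here. The helper name fmt_width() is made up for illustration:

```c
#include <stdio.h>

/*
 * Format one int with a run-time field width taken from an argument.
 * fmt_width() is a hypothetical helper, not an existing kernel function.
 */
static void fmt_width(char *buf, size_t size, int width, int value)
{
	/* A negative width acts like the '-' flag plus the absolute value. */
	snprintf(buf, size, "|%*d|", width, value);
}
```

With width 6 and value 26 the buffer holds "|    26|"; with width -6 it
holds "|26    |", the same as "%-6d" would give.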

> +
> +
> +Field precision:
> +================
> +
> +A field width may be defined when '%' is optionally followed by the above flag
> +characters:
> +
> + - '.number' - the decimal number defines the field precision
> + - '.*' the field precision is defined by an extra parameter
> +
> +The precision defines:
> +
> + - number of digits after the decimal point in float number conversions

No, don't mention floats, the kernel doesn't do those.

> + - minimal number of digits in integer conversions
> + - maximum number of characters in string conversions
> +
> +Examples::
> +
> + |%.3f| |12.300|

Remove.

> + |%.6d| | 26|

Nope, that actually produces 000026.
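
For reference, a small sketch of what precision does for the two
conversions that matter here. It uses userspace snprintf(), assuming the
kernel matches libc for these cases. The helper names are hypothetical:

```c
#include <stdio.h>

/* Integer conversion: precision is the minimum number of digits. */
static void fmt_int_prec(char *buf, size_t size, int value)
{
	snprintf(buf, size, "|%.6d|", value);
}

/* String conversion: precision is the maximum number of characters. */
static void fmt_str_prec(char *buf, size_t size, const char *s)
{
	snprintf(buf, size, "|%.3s|", s);
}
```

fmt_int_prec() with 26 yields "|000026|", and fmt_str_prec() with
"kernel" yields "|ker|".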

---

So overall, I'm not sure this is a net win. I think it might be better
to emphasize that

- the kernel doesn't do floats, argument reordering via m$, wide
characters/strings, %m or %n (just so that's out of the equation)

- for string and integer conversions, the kernel's printf is very close
to POSIX/libc in terms of flags, field width etc. [There are a few
exceptions; those I've found are documented in test_printf.c, but
nobody is ever likely to hit them.]

- for %p, the kernel has its own rules, starting with the fact that
modifying behaviour based on alphanumerics following the p is completely
non-standard.

and then spend the rest explaining those rules, and perhaps also give
some background on why the %p extensions exist and why they are
implemented the way they are. For example, "we want -Wformat to tell us
if something is wrong" means that we can only use a field width, and not
a precision, to pass an extra argument to a %psomething. Alphanumerics
are chosen because nobody would usually follow a normal %p by anything
but whitespace or punctuation, and because the compiler's format
checking is happy as long as there's some pointer argument corresponding
to the %p; the remaining characters are, from the compiler's POV, just
literal characters.
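
To illustrate the compiler's point of view, here is a rough userspace
sketch. It deliberately uses the C library rather than the kernel, so
the "I4" suffix has no special meaning. The helper name is hypothetical:

```c
#include <stdio.h>

/*
 * To the compiler (and to a plain libc) "%pI4" is a %p conversion
 * followed by the literal characters "I4". Format checking only needs
 * a pointer argument for the %p. Returns what snprintf() returns.
 */
static int fmt_as_plain_p(char *buf, size_t size, void *ptr)
{
	return snprintf(buf, size, "%pI4", ptr);
}
```

In userspace the result is the implementation-defined %p output followed
by "I4"; the kernel's vsprintf instead consumes those characters as a
modifier for the pointer conversion.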

Rasmus