Re: [RFC] Printing numbers in SI units

From: Petr Mladek
Date: Wed Feb 28 2024 - 06:20:33 EST


On Wed 2024-01-24 23:43:15, Rasmus Villemoes wrote:
> On 24/01/2024 19.58, Matthew Wilcox wrote:
> > I was looking at hugetlbfs and it has several snippets of code like
> > this:
> >
> > string_get_size(huge_page_size(h), 1, STRING_UNITS_2, buf, 32);
> > pr_warn("HugeTLB: allocating %u of page size %s failed node%d. Only allocated %lu hugepages.\n",
> > h->max_huge_pages_node[nid], buf, nid, i);
> >
> > That's not terribly ergonomic, so I wondered if I could do better.
> > Unfortunately, I decided to do it using the SPECIAL flag which GCC
> > warns about. But I've written the code now, so I'm sending it out in
> > case anybody has a better idea for how to incorporate it.
>
> Well, something that gcc will warn about with Wformat isn't gonna fly,
> obviously. But my man page also mentions ' as a possible flag for d
> conversions:
>
> ' For decimal conversion (i, d, u, f, F, g, G) the output is
> to be grouped with thousands'
> grouping characters if the locale information indicates any.
>
> Obviously, our printf wouldn't implement that, but "grouping by
> (roughly) 1000s" is kinda related to what you want the output to be, so
> it seems apt. The man page also says "Note that many versions of gcc(1)
> cannot parse this option and will issue a warning.", but I think that's
> an ancient and irrelevant note. None of the gcc (or clang) versions
> godbolt knows about give that warning.

I am personally not a big fan of using random printf() modifiers for
a non-standard behavior.

For me, it is much easier to look up what string_get_size() does.
The alternative would be to check "man 3 printf" just to realize that
%#d or %'d does something else in the kernel. And where the hell is
the behavior documented. I know it but how many others do?

And if I wanted to use it in a new code then I would not expect
that it is supported by vsprintf() kernel implementation. I would
try to find it in the existing string helpers and get sad that
it is not there.

I see only 18 users:

$> git grep string_get_size | grep -v -e test -e string_helpers | wc -l
18


> I'm not really sure the implementation should imply a trailing B; that
> might as well be included in the format string itself - perhaps the
> quantity is bits, so a lowercase b is better, or perhaps it's some
> frequency so the user wants Hz.
>
> As for frequency, one would probably prefer the real, 1000^n, SI
> prefixes and not the IEC 1024^n ones, and regardless, I think there
> should be some way of doing both STRING_UNITS_2 and STRING_UNITS_10. At
> first I thought one could distinguish by using either one or two ', but
> gcc does warn for repeated flags. Two options I can think of: (a)
> Overload the precision, so a precision p >= 10 means STRING_UNITS_10
> with actual precision being p-10 (and yes, by all means cap that at 2).

Yet another non-standard behavior.

> (b) Currently %i and %d are completely equivalent, we could make them
> different in case a ' flag is present and make one do the 2^n and other
> the 10^n.

Same here.

IMHO, this "feature" would add more harm than good. And it is not worth it.

Best Regards,
Petr