Re: [PATCH] sysinfo: Saturate 16-bit procs rather than wrapping

From: Josh Triplett
Date: Wed Apr 05 2023 - 21:04:24 EST


On Wed, Apr 05, 2023 at 05:27:12PM -0500, Eric W. Biederman wrote:
> Josh Triplett <josh@xxxxxxxxxxxxxxxx> writes:
>
> > struct sysinfo has a 16-bit field for the number of processes. Current
> > systems can easily exceed this. Rather than wrapping around, saturate
> > the value at U16_MAX. This is still incorrect, but more likely to
> > help the user know what's going on; a caller can then (for instance)
> > parse the full value out of /proc/loadavg.
> >
> > Signed-off-by: Josh Triplett <josh@xxxxxxxxxxxxxxxx>
> > ---
> >
> > Not sure what tree changes to kernel/sys.c should flow through. Andrew,
> > could you take this through your tree (assuming you agree with it), or
> > suggest what tree it should go through instead?
>
>
> Mind if I ask what the motivation for this is?
[...]
> I looked because just saturating the 16bit field feels like a hack
> that will continue to encourage buggy programs to stay buggy.

On the contrary, it seemed to me like the best way to make issues more
obvious. Wrapping around and showing (say) 12 processes because the
results are mod 65536 won't necessarily immediately stand out in stats,
the way that saturating at 65535 does.

That said, I like the idea of doing 0 even more:

> If there is real value in sysinfo returning a this information someone
> could go through the work and update the kernel to return the high
> bits of the process count in info->pad that is immediately after
> info->procs, and then update the apps or libc to find those high bits.

I'd love to do that; I just thought that'd be less likely to be
accepted. But yes, I think that'd be even better.

That said, I think we need to return *all* the bits in the later
padding, rather than just the high bits, so that we can reliably detect
if the bits were provided. We can return 0 in the existing field, and
then return the process count in the padding, and if the padding is 0
then it wasn't provided.

(If we returned the high bits in the later padding, it'd be hard to tell
if low=42 and high=0 was 42 processes or 42-mod-65536 processes on an
old kernel. If we return the full bits in the later padding, we can
reliably tell the difference between those.)

- Josh Triplett