Re: [PATCH 11/23] y2038: rusage: use __kernel_old_timeval

From: Arnd Bergmann
Date: Thu Nov 14 2019 - 05:18:35 EST


On Thu, Nov 14, 2019 at 1:38 AM Christian Brauner
<christian.brauner@xxxxxxxxxx> wrote:
> On Wed, Nov 13, 2019 at 11:02:12AM +0100, Arnd Bergmann wrote:
> > On Tue, Nov 12, 2019 at 10:09 PM Cyrill Gorcunov <gorcunov@xxxxxxxxx> wrote:
> > >
> > > On Fri, Nov 08, 2019 at 10:12:10PM +0100, Arnd Bergmann wrote:
> >
> > > > ---
> > > > Question: should we also rename 'struct rusage' into 'struct __kernel_rusage'
> > > > here, to make them completely unambiguous?
> > >
> > > The patch looks ok to me. I must confess I looked into rusage long ago
> > > so __kernel_timespec type used in uapi made me nervous at first,
> > > but then I found that we have this type defined in time_types.h uapi
> > > so userspace should be safe. I also like the idea of __kernel_rusage
> > > but definitely on top of the series.
> >
> > There are clearly too many time types at the moment, but I'm in the
> > process of throwing out the ones we no longer need now.
> >
> > I do have a number of patches implementing other variants for the syscall,
> > and I suppose that if we end up adding __kernel_rusage, that would
> > have to go with a set of syscalls using 64-bit seconds/nanoseconds
> > rather than the old 32/64-bit microseconds. I don't know what other
> > changes remain that anyone would want from sys_waitid() now that
> > it does support pidfd.
> >
> > If there is still a need for a new waitid() replacement, that should take
> > that new __kernel_rusage I think, but until then I hope we are fine
> > with today's getrusage+waitid based on the current struct rusage.
>
> Note that glibc does _not_ expose the rusage argument, i.e. most of
> userspace is unaware that waitid() does allow you to get rusage
> information. So users first need to know that waitid() has an rusage
> argument and then need to call the waitid() syscall directly.

On architectures that don't have a wait4 syscall (riscv32 for now),
glibc uses waitid to implement wait4 and wait3.
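
For illustration only, here is a minimal sketch of what calling the
waitid() syscall directly with its rusage argument could look like from
userspace. The helper name reap_with_rusage() is made up for this
example, and the sketch assumes the libc 'struct rusage' has the same
layout as the kernel's, which should hold on 64-bit architectures but
not necessarily on 32-bit ones built with a 64-bit time_t.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <sys/resource.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): fork a child that exits
 * with 'code', then reap it with the raw waitid() syscall, passing the
 * fifth (rusage) argument that the glibc waitid() wrapper does not
 * expose.
 *
 * Assumption: the libc 'struct rusage' matches the kernel layout on
 * the architecture this is built for.
 *
 * Returns the child's exit status, or -1 on error.
 */
static int reap_with_rusage(int code, struct rusage *ru)
{
	siginfo_t info = {0};
	pid_t pid;

	assert(ru != NULL);

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(code);	/* child: exit immediately */

	/* raw syscall: waitid(idtype, id, infop, options, rusage) */
	if (syscall(SYS_waitid, P_PID, pid, &info, WEXITED, ru) < 0)
		return -1;

	if (info.si_code != CLD_EXITED)
		return -1;

	return info.si_status;
}
```

A caller would pass a 'struct rusage' on its stack and read
ru_utime/ru_stime afterwards, both of which are timevals (microsecond
resolution) in the current layout.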

> > BSD has wait6() to return separate rusage structures for 'self' and
> > 'children', but I could not find any application (using the freebsd
> > sources and debian code search) that actually uses that information,
> > so there might not be any demand for that.
>
> Speaking specifically for Linux now, I think that rusage does not
> actually expose the information most relevant users are interested in.
> On Linux nowadays it is _way_ more interesting to retrieve stats
> relative to the cgroup the task lived in etc.
> Doing a git grep -i rusage in the systemd source code shows that rusage
> is used _nowhere_. And I consider an init system to be the most likely
> candidate to be interested in rusage.

I looked at a couple of implementations of time(1). It is one example
of a program that sometimes uses wait3(), though other implementations
just call getrusage() in the parent process before the fork/exec. None
of them actually seem to report better than millisecond resolution, so
there is no strict reason to do a timespec replacement for these.
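
As a rough sketch of that pattern (not taken from any particular time(1)
implementation; time_command() and timeval_to_ms() are names made up
for this example), a tool could collect the child's rusage with wait4()
and round the timeval down to milliseconds before printing:

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <sys/resource.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Convert a timeval (seconds + microseconds) to whole milliseconds. */
static long long timeval_to_ms(struct timeval tv)
{
	return (long long)tv.tv_sec * 1000 + tv.tv_usec / 1000;
}

/*
 * Hypothetical helper: run argv[0] with arguments argv, wait for it
 * with wait4() and report its user and system CPU time in
 * milliseconds.
 *
 * Returns the command's exit status, or -1 on error or abnormal exit.
 */
static int time_command(char *const argv[], long long *user_ms,
			long long *sys_ms)
{
	struct rusage ru;
	int status;
	pid_t pid;

	assert(argv != NULL && argv[0] != NULL);

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		execvp(argv[0], argv);
		_exit(127);	/* exec failed */
	}

	/* wait4() fills 'ru' with the resource usage of the reaped child */
	if (wait4(pid, &status, 0, &ru) < 0)
		return -1;

	*user_ms = timeval_to_ms(ru.ru_utime);
	*sys_ms = timeval_to_ms(ru.ru_stime);

	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With output rounded like this, the microsecond timevals in the current
struct rusage already carry more precision than gets displayed.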

Arnd