Re: [PATCH v3 1/2] epoll: add nsec timeout support with epoll_pwait2

From: Willem de Bruijn
Date: Thu Nov 19 2020 - 09:21:09 EST


On Wed, Nov 18, 2020 at 10:59 AM David Laight <David.Laight@xxxxxxxxxx> wrote:
>
> From: Arnd Bergmann
> > Sent: 18 November 2020 15:38
> >
> > On Wed, Nov 18, 2020 at 4:10 PM Willem de Bruijn
> > <willemdebruijn.kernel@xxxxxxxxx> wrote:
> > > On Wed, Nov 18, 2020 at 10:00 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> > > >
> > > > On Wed, Nov 18, 2020 at 09:46:15AM -0500, Willem de Bruijn wrote:
> > > > > -static inline struct timespec64 ep_set_mstimeout(long ms)
> > > > > +static inline struct timespec64 ep_set_nstimeout(s64 timeout)
> > > > > {
> > > > > - struct timespec64 now, ts = {
> > > > > - .tv_sec = ms / MSEC_PER_SEC,
> > > > > - .tv_nsec = NSEC_PER_MSEC * (ms % MSEC_PER_SEC),
> > > > > - };
> > > > > + struct timespec64 now, ts;
> > > > >
> > > > > + ts = ns_to_timespec64(timeout);
> > > > > ktime_get_ts64(&now);
> > > > > return timespec64_add_safe(now, ts);
> > > > > }
> > > >
> > > > Why do you pass around an s64 for timeout, converting it to and from
> > > > a timespec64 instead of passing around a timespec64?
> > >
> > > I implemented both approaches. The alternative was no simpler.
> > > Conversion in existing epoll_wait, epoll_pwait and epoll_pwait
> > > (compat) becomes a bit more complex and adds a stack variable there if
> > > passing the timespec64 by reference. And in ep_poll the ternary
> > > timeout test > 0, 0, < 0 now requires checking both tv_secs and
> > > tv_nsecs. Based on that, I found this simpler. But no strong
> > > preference.
> >
> > The 64-bit division can be fairly expensive on 32-bit architectures,
> > at least when it doesn't get optimized into a multiply+shift.
>
> I'd have thought you'd want to do everything in 64bit nanosecs.
> Conversions to/from any of the 'timespec' structure are expensive.

I took another look at this.

The only real reason for the timespec64 is that
select_estimate_accuracy takes that type, which makes sense, because
do_select does.

But for epoll, this is inefficient: ep_set_mstimeout calls
ktime_get_ts64 to convert the timeout to an offset from current time,
only to pass it to select_estimate_accuracy, which performs another
ktime_get_ts64 and subtracts it to get back to (approx.) the original
timeout.

How about a separate patch that adds epoll_estimate_accuracy with
the same rules (wrt rt_task, current->timer_slack, nice and upper bound)
but taking an s64 timeout?
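
A rough userspace sketch of what such a helper could look like, to make
the idea concrete. The constants (0.1% slack, 0.5% when niced, 100 ms
cap) are what I recall from select_estimate_accuracy and should be
checked against fs/select.c. The struct task_hints and the function name
are hypothetical stand-ins for rt_task(current), task_nice(current) and
current->timer_slack_ns; this is not the proposed kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_MSEC 1000000LL
#define MAX_SLACK_NS  (100 * NSEC_PER_MSEC)   /* assumed upper bound */

/* Hypothetical stand-in for the per-task state read from current. */
struct task_hints {
	bool is_rt;               /* rt_task(current) */
	int nice;                 /* task_nice(current) */
	uint64_t timer_slack_ns;  /* current->timer_slack_ns */
};

/*
 * Estimate timer slack directly from a relative timeout in nanoseconds.
 * No clock read and no timespec64 conversion is needed.
 */
static uint64_t epoll_estimate_accuracy_sketch(int64_t timeout_ns,
					       const struct task_hints *t)
{
	int64_t divfactor = 1000;   /* 0.1% of the timeout */
	int64_t slack = 0;

	/* Realtime tasks get no slack. */
	if (t->is_rt)
		return 0;

	if (timeout_ns > 0) {
		/* Niced tasks tolerate more slack: 0.5%. */
		if (t->nice > 0)
			divfactor /= 5;

		slack = timeout_ns / divfactor;
		if (slack > MAX_SLACK_NS)
			slack = MAX_SLACK_NS;
	}

	/* Never go below the task's configured timer slack. */
	if ((uint64_t)slack < t->timer_slack_ns)
		return t->timer_slack_ns;
	return (uint64_t)slack;
}
```

With a relative s64 as input, the ktime_get_ts64 round trip goes away
entirely.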

One variation: since it is approximate, I suppose we could even replace
the division by a right shift?
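
To illustrate the shift variant, here is a small sketch comparing the
two. The shift amounts are my assumption: 1000 is approximated by 1024
(>> 10) and the niced factor 200 by 256 (>> 8). The function names are
made up for the example, and both expect a non-negative timeout.

```c
#include <stdint.h>

/* Exact version: divide by 1000, or by 200 for niced tasks. */
static inline int64_t slack_by_division(int64_t timeout_ns, int niced)
{
	return timeout_ns / (niced ? 200 : 1000);
}

/*
 * Approximate version: shift by 10 (divide by 1024), or by 8
 * (divide by 256) for niced tasks. Avoids a 64-bit division on
 * 32-bit architectures. Caller must pass timeout_ns >= 0.
 */
static inline int64_t slack_by_shift(int64_t timeout_ns, int niced)
{
	return timeout_ns >> (niced ? 8 : 10);
}
```

The shift result is about 2% smaller than the exact division in the
default case and about 22% smaller in the niced case, which may or may
not be acceptable for an estimate.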

After that, using s64 everywhere is indeed much simpler. And with that
I will revise the new epoll_pwait2 interface to take a long long
instead of struct timespec.

Apologies for the delay. I forgot that I'm only subscribed to netdev@
in my main email account.