Re: [PATCH v3 1/2] epoll: add nsec timeout support with epoll_pwait2
From: Arnd Bergmann
Date: Thu Nov 19 2020 - 10:46:14 EST
On Thu, Nov 19, 2020 at 3:31 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>
> On Thu, Nov 19, 2020 at 09:19:35AM -0500, Willem de Bruijn wrote:
> > But for epoll, this is inefficient: in ep_set_mstimeout it calls
> > ktime_get_ts64 to convert timeout to an offset from current time, only
> > to pass it to select_estimate_accuracy to then perform another
> > ktime_get_ts64 and subtract this to get back to (approx.) the original
> > timeout.
Right, it would be good to avoid the second ktime_get_ts64(), as reading
the clocksource itself can be expensive.
> > How about a separate patch that adds epoll_estimate_accuracy with
> > the same rules (wrt rt_task, current->timer_slack, nice and upper bound)
> > but taking an s64 timeout.
> >
> > One variation, since it is approximate, I suppose we could even replace
> > division by a right shift?
The right shift would work indeed, but it's also a bit ugly unless
__estimate_accuracy() is changed to always use the same shift.

I see that on 32-bit ARM, select_estimate_accuracy() calls
the external __aeabi_idiv() function to do the 32-bit division, so
changing it to a shift would speed up select as well.

Changing select_estimate_accuracy() to take the relative timeout
as an argument to avoid the extra ktime_get_ts64() should
have a larger impact.
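
To make the two ideas concrete, here is a rough userspace sketch, not
kernel code, of an estimate that takes the relative timeout directly
(so no clock read is needed) and uses shifts instead of division. The
function name, the parameter list, and passing rt/nice/timer_slack as
plain arguments are assumptions made for illustration; in the kernel
these come from current. The shifts only approximate the existing
divisors: >> 10 gives 1/1024 rather than 1/1000, and >> 8 gives 1/256
rather than 1/200.

```c
#include <stdint.h>

#define NSEC_PER_MSEC 1000000LL
#define MAX_SLACK_NS  (100 * NSEC_PER_MSEC)	/* upper bound on slack: 100 ms */

/*
 * Hypothetical helper: estimate timer slack for a relative timeout in
 * nanoseconds. The caller passes the relative value, so no clock read
 * and no timespec subtraction is needed here.
 */
static int64_t estimate_accuracy_ns(int64_t timeout_ns, int is_rt,
				    int nice, int64_t timer_slack_ns)
{
	int64_t slack;

	/* Realtime tasks get no slack at all. */
	if (is_rt)
		return 0;

	if (timeout_ns < 0)
		slack = 0;
	else
		/* roughly 0.1% of the timeout, or roughly 0.5% if niced */
		slack = timeout_ns >> (nice > 0 ? 8 : 10);

	/* Apply the upper bound, then the per-task minimum slack. */
	if (slack > MAX_SLACK_NS)
		slack = MAX_SLACK_NS;
	if (slack < timer_slack_ns)
		slack = timer_slack_ns;

	return slack;
}
```

With a 1.024 ms timeout and no minimum slack, this yields 1000 ns for
a nice 0 task and 4000 ns for a niced one.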
> > After that, using s64 everywhere is indeed much simpler. And with that
> > I will revise the new epoll_pwait2 interface to take a long long
> > instead of struct timespec.
>
> I think the userspace interface should take a struct timespec
> for consistency with ppoll and pselect. And epoll should use
> poll_select_set_timeout() to convert the relative timeout to an absolute
> endtime. Make epoll more consistent with select/poll, not less ...
I don't see a problem with an s64 timeout if that makes the interface
simpler by avoiding differences between the 32-bit and 64-bit ABIs.
More importantly, I think it should differ from poll/select by calculating
and writing back the remaining timeout.
I don't know what the latest view on absolute timeouts at the syscall
ABI is. It would probably simplify the implementation, but make it
less consistent with the others. Futex uses absolute timeouts, but
is itself inconsistent about that.
Arnd