[PATCH RESEND v2] fs/select.c: batch user writes in do_sys_poll

From: Daniel Axtens
Date: Wed Nov 18 2020 - 19:43:05 EST


When returning results to userspace, do_sys_poll repeatedly calls
put_user() - once per fd that it's watching.

This means that on architectures that support some form of
kernel-to-userspace access protection, we end up enabling and disabling
access once for each file descriptor we're watching. This is inefficient
and we can improve things. We could do careful batching of the opening
and closing of the access window, or we could just copy each walk's
entire entries array. While that copies more data, it potentially does
so more efficiently, and the cost of the extra copying is much less than
the overhead of repeatedly opening and closing the access window.

Unscientific benchmarking with the poll2_threads microbenchmark from
will-it-scale, run as `./poll2_threads -t 1 -s 15`:

- Bare-metal Power9 with KUAP: ~49% speed-up
- VM on amd64 laptop with SMAP: ~25% speed-up

Signed-off-by: Daniel Axtens <dja@xxxxxxxxxx>
---
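For illustration only, here is a minimal userspace sketch of the
trade-off described in the commit message. It is not kernel code: the
counter `window_opens` and the helpers `model_put_revents()` and
`model_copy_out()` are hypothetical stand-ins that assume each
__put_user() call and each copy_to_user() call opens and closes the
user-access window exactly once.

```c
#include <poll.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical counter: how many times the access window was opened. */
static unsigned long window_opens;

/* Models __put_user(): one window open/close per revents store. */
static void model_put_revents(struct pollfd *dst, short revents)
{
	window_opens++;
	dst->revents = revents;
}

/* Models copy_to_user(): one window open/close per call, any length. */
static void model_copy_out(void *dst, const void *src, size_t len)
{
	window_opens++;
	memcpy(dst, src, len);
}

/* Old scheme: store revents one descriptor at a time. */
static unsigned long copy_per_fd(struct pollfd *ufds,
				 const struct pollfd *kfds, size_t n)
{
	size_t i;

	window_opens = 0;
	for (i = 0; i < n; i++)
		model_put_revents(&ufds[i], kfds[i].revents);
	return window_opens;
}

/* New scheme: copy the whole entries array in one call. */
static unsigned long copy_batched(struct pollfd *ufds,
				  const struct pollfd *kfds, size_t n)
{
	window_opens = 0;
	model_copy_out(ufds, kfds, sizeof(*kfds) * n);
	return window_opens;
}
```

Under these assumptions the per-fd loop opens the window once per
descriptor, while the batched copy opens it once per array. Note that
the batched copy also rewrites the `fd` and `events` fields, which the
per-fd loop leaves untouched.
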
fs/select.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/fs/select.c b/fs/select.c
index ebfebdfe5c69..4a74d1353ccb 100644
--- a/fs/select.c
+++ b/fs/select.c
@@ -1012,12 +1012,10 @@ static int do_sys_poll(struct pollfd __user *ufds, unsigned int nfds,
 	poll_freewait(&table);

 	for (walk = head; walk; walk = walk->next) {
-		struct pollfd *fds = walk->entries;
-		int j;
-
-		for (j = 0; j < walk->len; j++, ufds++)
-			if (__put_user(fds[j].revents, &ufds->revents))
-				goto out_fds;
+		if (copy_to_user(ufds, walk->entries,
+				 sizeof(struct pollfd) * walk->len))
+			goto out_fds;
+		ufds += walk->len;
 	}

 	err = fdcount;
--
2.25.1