Re: Strange issues with epoll since 5.0

From: Deepa Dinamani
Date: Wed May 01 2019 - 14:37:51 EST


Thanks for trying the fix.

So here is my analysis:

Let's start with epoll_pwait:

ep_poll() is what checks for signal_pending() and is responsible for
setting errno to -EINTR when there is a signal.

So if a signal is received after ep_poll(), it is never noticed by the
syscall during execution.

Moreover, the original code before
854a6ed56839a40f6b5d02a2962f48841482eec4 ("signal: Add
restore_user_sigmask()"), had the following call flow:

error = do_epoll_wait(epfd, events, maxevents, timeout);

**** Here error = 0 if the signal is received after ep_poll().

- /*
- * If we changed the signal mask, we need to restore the original one.
- * In case we've got a signal while waiting, we do not restore the
- * signal mask yet, and we allow do_signal() to deliver the signal on
- * the way back to userspace, before the signal mask is restored.
- */
- if (sigmask) {
- if (error == -EINTR) {
- memcpy(&current->saved_sigmask, &sigsaved,
- sizeof(sigsaved));
- set_restore_sigmask();
- } else

**** Execution reaches this else statement and the sigmask is restored
directly, ignoring the newly generated signal. The signal is never
handled.

- set_current_blocked(&sigsaved);
- }

In the current execution flow:

error = do_epoll_wait(epfd, events, maxevents, timeout);

**** error is still 0 as ep_poll() did not detect the signal.

restore_user_sigmask(sigmask, &sigsaved, error == -EITNR);

void restore_user_sigmask(const void __user *usigmask, sigset_t *sigsaved)
{

if (!usigmask)
return;
/*
* When signals are pending, do not restore them here.
* Restoring sigmask here can lead to delivering signals that the above
* syscalls are intended to block because of the sigmask passed in.
*/
if (signal_pending(current)) {

**** execution path reaches here and do_signal() actually delivers the
signal to userspace. But the errno is not set. So the userspace fails
to notice it.

current->saved_sigmask = *sigsaved;
set_restore_sigmask();
return;
}

/*
* This is needed because the fast syscall return path does not restore
* saved_sigmask when signals are not pending.
*/
set_current_blocked(sigsaved);
}

For other syscalls in the same commit:

sys_io_pgetevents() does not seem to have this problem as we are still
checking signal_pending() here.
sys_pselect6() seems to have a similar problem. The changes to
sys_pselect6() also impact sys_select() as the changes are in the
common code path.

So the 854a6ed56839a40f6 seems to be better than the original code in
that it detects the signal. But, the problem is that it doesn't
communicate it to the userspace.

So a patch like below solves the problem. This is incomplete. I'll
verify and send you a proper fix you can test soon. This is just for
the sake of discussion:

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 4a0e98d87fcc..63a387329c3d 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -2317,7 +2317,7 @@ SYSCALL_DEFINE6(epoll_pwait, int, epfd, struct
epoll_event __user *, events,
int, maxevents, int, timeout, const sigset_t __user *, sigmask,
size_t, sigsetsize)
{
- int error;
+ int error, signal_detected;
sigset_t ksigmask, sigsaved;

/*
@@ -2330,7 +2330,10 @@ SYSCALL_DEFINE6(epoll_pwait, int, epfd, struct
epoll_event __user *, events,

error = do_epoll_wait(epfd, events, maxevents, timeout);

- restore_user_sigmask(sigmask, &sigsaved);
+ signal_detected = restore_user_sigmask(sigmask, &sigsaved);
+
+ if (signal_detected && !error)
+ return -EITNR;

return error;
}
@@ -2342,7 +2345,7 @@ COMPAT_SYSCALL_DEFINE6(epoll_pwait, int, epfd,
const compat_sigset_t __user *, sigmask,
compat_size_t, sigsetsize)
{
- long err;
+ long err, signal_detected;
sigset_t ksigmask, sigsaved;

/*
@@ -2355,7 +2358,10 @@ COMPAT_SYSCALL_DEFINE6(epoll_pwait, int, epfd,

err = do_epoll_wait(epfd, events, maxevents, timeout);

- restore_user_sigmask(sigmask, &sigsaved);
+ signal_detected = restore_user_sigmask(sigmask, &sigsaved);
+
+ if (signal_detected && !err)
+ return -EITNR;

return err;
}
diff --git a/kernel/signal.c b/kernel/signal.c
index 3a9e41197d46..c76ab2a52ebf 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2849,11 +2849,11 @@ EXPORT_SYMBOL(set_compat_user_sigmask);
* This is useful for syscalls such as ppoll, pselect, io_pgetevents and
* epoll_pwait where a new sigmask is passed in from userland for the syscalls.
*/
-void restore_user_sigmask(const void __user *usigmask, sigset_t *sigsaved)
+int restore_user_sigmask(const void __user *usigmask, sigset_t *sigsaved)
{

if (!usigmask)
- return;
+ return 0;
/*
* When signals are pending, do not restore them here.
* Restoring sigmask here can lead to delivering signals that the above
@@ -2862,7 +2862,7 @@ void restore_user_sigmask(const void __user
*usigmask, sigset_t *sigsaved)
if (signal_pending(current)) {
current->saved_sigmask = *sigsaved;
set_restore_sigmask();
- return;
+ return 0;
}

-Deepa