Re: [PATCH v2] signal: Adjust error codes according to restore_user_sigmask()

From: Oleg Nesterov
Date: Wed May 22 2019 - 12:17:03 EST


On 05/22, Deepa Dinamani wrote:
>
> -Deepa
>
> > On May 22, 2019, at 8:05 AM, Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
> >
> >> On 05/21, Deepa Dinamani wrote:
> >>
> >> Note that this patch returns interrupted errors (EINTR, ERESTARTNOHAND,
> >> etc) only when there is no other error. If there is a signal and an error
> >> like EINVAL, the syscalls return -EINVAL rather than the interrupted
> >> error codes.
> >
> > Ugh. I need to re-check, but at first glance I really dislike this change.
> >
> > I think we can fix the problem _and_ simplify the code. Something like below.
> > The patch is obviously incomplete, it changes only one caller of
> > set_user_sigmask(), epoll_pwait() to explain what I mean.
> > restore_user_sigmask() should simply die. Although perhaps another helper
> > makes sense to add WARN_ON(test_tsk_restore_sigmask() && !signal_pending).
>
> restore_user_sigmask() was added because of all the variants of these
> syscalls we added because of y2038 as noted in commit message:
>
> signal: Add restore_user_sigmask()
>
> Refactor the logic to restore the sigmask before the syscall
> returns into an api.
> This is useful for versions of syscalls that pass in the
> sigmask and expect the current->sigmask to be changed during
> the execution and restored after the execution of the syscall.
>
> With the advent of new y2038 syscalls in the subsequent patches,
> we add two more new versions of the syscalls (for pselect, ppoll
> and io_pgetevents) in addition to the existing native and compat
> versions. Adding such an api reduces the logic that would need to
> be replicated otherwise.

Again, I need to re-check, will continue tomorrow. But so far I am not sure
this helper can actually help.
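
For anyone following along, this is the user-visible contract all of these helpers have to preserve, sketched from userspace. It is only an illustration, not part of any patch: wait_with_temp_mask(), on_usr1() and got_usr1 are names made up here, and the 1000 ms timeout is arbitrary. SIGUSR1 is blocked and left pending, then epoll_pwait() is given a mask that unblocks it for the duration of the wait.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical names, for illustration only. */
volatile sig_atomic_t got_usr1;

void on_usr1(int sig)
{
	(void)sig;
	got_usr1 = 1;
}

/*
 * Returns 0 if epoll_pwait() did not fail, otherwise the errno it set.
 * *blocked_after reports whether SIGUSR1 is blocked again on return.
 */
int wait_with_temp_mask(bool *blocked_after)
{
	struct sigaction sa;
	sigset_t block, old, during, now;
	struct epoll_event ev;
	int epfd, ret, err;

	assert(blocked_after != NULL);

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_usr1;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGUSR1, &sa, NULL) < 0)
		return errno;

	/* Block SIGUSR1, then raise it so that it stays pending. */
	sigemptyset(&block);
	sigaddset(&block, SIGUSR1);
	if (sigprocmask(SIG_BLOCK, &block, &old) < 0)
		return errno;
	raise(SIGUSR1);

	/* Temporary mask for the wait: the old mask minus SIGUSR1. */
	during = old;
	sigdelset(&during, SIGUSR1);

	epfd = epoll_create1(0);
	if (epfd < 0)
		return errno;

	/* No fds are registered, so only the signal or the timeout ends this. */
	ret = epoll_pwait(epfd, &ev, 1, 1000, &during);
	err = ret < 0 ? errno : 0;

	/* Query the mask the kernel left us with. */
	sigprocmask(SIG_SETMASK, NULL, &now);
	*blocked_after = sigismember(&now, SIGUSR1) == 1;

	close(epfd);
	sigprocmask(SIG_SETMASK, &old, NULL);
	return err;
}
```

With the intended semantics I would expect the handler to run, the call to fail with EINTR, and SIGUSR1 to be blocked again once the syscall has returned. The disagreement in this thread is about how the kernel gets there, not about this contract.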

> > --- a/fs/eventpoll.c
> > +++ b/fs/eventpoll.c
> > @@ -2318,19 +2318,19 @@ SYSCALL_DEFINE6(epoll_pwait, int, epfd, struct epoll_event __user *, events,
> > size_t, sigsetsize)
> > {
> > int error;
> > - sigset_t ksigmask, sigsaved;
> >
> > /*
> > * If the caller wants a certain signal mask to be set during the wait,
> > * we apply it here.
> > */
> > - error = set_user_sigmask(sigmask, &ksigmask, &sigsaved, sigsetsize);
> > + error = set_user_sigmask(sigmask, sigsetsize);
> > if (error)
> > return error;
> >
> > error = do_epoll_wait(epfd, events, maxevents, timeout);
> >
> > - restore_user_sigmask(sigmask, &sigsaved);
> > + if (error != -EINTR)
>
> As you address all the other syscalls this condition becomes more and
> more complicated.

Maybe.

> > --- a/include/linux/sched/signal.h
> > +++ b/include/linux/sched/signal.h
> > @@ -416,7 +416,6 @@ void task_join_group_stop(struct task_struct *task);
> > static inline void set_restore_sigmask(void)
> > {
> > set_thread_flag(TIF_RESTORE_SIGMASK);
> > - WARN_ON(!test_thread_flag(TIF_SIGPENDING));
>
> So you always want do_signal() to be called?

Why do you think so? No. This is just to avoid the warning, because with the
patch I sent set_restore_sigmask() is called "in advance".
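
To make "in advance" concrete, here is a toy userspace model of the flow, not kernel code. Everything in it is hypothetical: struct task_model and the model_*() helpers are invented for this note, the two booleans merely stand in for TIF_RESTORE_SIGMASK and TIF_SIGPENDING, and the body under "error != -EINTR" is my assumption, since the hunk is cut off in the quote above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-task state involved. */
struct task_model {
	unsigned long blocked;		/* current signal mask */
	unsigned long saved_blocked;	/* mask to put back later */
	bool restore_sigmask;		/* models TIF_RESTORE_SIGMASK */
	bool sigpending;		/* models TIF_SIGPENDING */
};

/* Install the temporary mask and mark "restore needed" up front. */
void model_set_user_sigmask(struct task_model *t, unsigned long newmask)
{
	assert(t != NULL);
	t->saved_blocked = t->blocked;
	t->blocked = newmask;
	t->restore_sigmask = true;	/* set before any signal is pending */
}

/* The condition the removed WARN_ON() would have complained about. */
bool model_old_warn_would_fire(const struct task_model *t)
{
	return t->restore_sigmask && !t->sigpending;
}

/* ASSUMPTION: unless interrupted, put the saved mask back right away. */
void model_finish_syscall(struct task_model *t, int error)
{
	if (error != -EINTR) {
		t->blocked = t->saved_blocked;
		t->restore_sigmask = false;
	}
	/* On -EINTR the flag stays set for the signal delivery path. */
}

/* Signal delivery: consume the pending signal, then restore the mask. */
void model_deliver_signal(struct task_model *t)
{
	t->sigpending = false;
	if (t->restore_sigmask) {
		t->blocked = t->saved_blocked;
		t->restore_sigmask = false;
	}
}
```

In this model the flag is set as soon as the temporary mask is installed, so the old warning condition is true even though no signal is pending. That is the only reason the WARN_ON() has to go; nothing here forces the signal delivery path to run when the syscall was not interrupted.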

> You will have to check each architecture's implementation of
> do_signal() to check if that has any side effects.

I don't think so.

> Although this is not what the patch is solving.

Sure. But you know, after I tried to read the changelog, I am not sure
I understand what exactly you are trying to fix. Could you please explain
this part

    The behavior
    before 854a6ed56839a was that the signals were dropped after the error
    code was decided. This resulted in lost signals but the userspace did not
    notice it

? I fail to understand it, sorry. It looks as if the code was already buggy before
that commit and it could miss a signal or something like this, but I do not see how.

Oleg.