Re: [PATCH RESEND 2/5] seccomp: Add wait_killable semantic to seccomp user notifier

From: Sargun Dhillon
Date: Mon Apr 26 2021 - 18:15:58 EST


On Mon, Apr 26, 2021 at 01:02:29PM -0600, Tycho Andersen wrote:
> On Mon, Apr 26, 2021 at 11:06:07AM -0700, Sargun Dhillon wrote:
> > @@ -1103,11 +1111,31 @@ static int seccomp_do_user_notification(int this_syscall,
> > * This is where we wait for a reply from userspace.
> > */
> > do {
> > + interruptible = notification_interruptible(&n);
> > +
> > mutex_unlock(&match->notify_lock);
> > - err = wait_for_completion_interruptible(&n.ready);
> > + if (interruptible)
> > + err = wait_for_completion_interruptible(&n.ready);
> > + else
> > + err = wait_for_completion_killable(&n.ready);
> > mutex_lock(&match->notify_lock);
> > - if (err != 0)
> > +
> > + if (err != 0) {
> > + /*
> > + * There is a race condition here where if the
> > + * notification was received with the
> > + * SECCOMP_USER_NOTIF_FLAG_WAIT_KILLABLE flag, but a
> > + * non-fatal signal was received before we could
> > + * transition we could erroneously end our wait early.
> > + *
> > + * The next wait for completion will ensure the signal
> > + * was not fatal.
> > + */
> > + if (interruptible && !notification_interruptible(&n))
> > + continue;
>
> I'm trying to understand how one would hit this race,
>

I'm thinking:
P: Process that "generates" notification
S: Supervisor
U: User

P: Generated notification
S: ioctl(RECV...) // With wait_killable flag.
...complete is called in the supervisor, but the P may not be woken up...
U: kill -SIGTERM $P
...signal gets delivered to p and causes wakeup and
wait_for_completion_interruptible returns 1...

Then you need to check the race
> > @@ -1457,6 +1487,12 @@ static long seccomp_notify_recv(struct seccomp_filter *filter,
> > unotif.pid = task_pid_vnr(knotif->task);
> > unotif.data = *(knotif->data);
> >
> > + if (unotif.flags & SECCOMP_USER_NOTIF_FLAG_WAIT_KILLABLE) {
> > + knotif->wait_killable = true;
> > + complete(&knotif->ready);
> > + }
> > +
> > +
> > knotif->state = SECCOMP_NOTIFY_SENT;
> > wake_up_poll(&filter->wqh, EPOLLOUT | EPOLLWRNORM);
> > ret = 0;
>
> Seems like the idea is that if someone does a ioctl(RECV, ...) twice
> they'll hit it? But doesn't the test for NOTIFY_INIT and return
> -ENOENT above this hunk prevent that?
>
> Thanks,
>
> Tycho