Re: Preemption Signal Management

From: Christian Brauner
Date: Wed May 26 2021 - 13:53:14 EST

On Tue, May 25, 2021 at 05:39:49PM -0700, Andy Lutomirski wrote:
> On 5/21/21 9:23 AM, Sargun Dhillon wrote:
> > Andy pointed out that we need a mechanism to determine whether or not
> > notifications are preempted. He suggested we use EPOLLPRI to indicate
> > whether or not notifications are preempted. My outstanding question is
> > whether or not we need to be able to get insight into what caused the
> > preemption, and to which notification.
> >
> > In the past, Christian has suggested just background polling
> > notification IDs for validity, which is a fine mechanism to determine
> > that preemption has occurred. We could raise EPOLLPRI whenever a
> > notification has changed into the preempted state, but that would
> > require O(n) operations across all outstanding notifications to
> > determine which one was preempted, and in addition, it doesn't give a
> > lot of information as to why the preemption occurred (fatal signal,
> > preemption?).
> >
> > In order to try to break this into small parts, I suggest:
> > 1. We make it so EPOLLPRI is raised (always) on preempted notifications
> > 2. We allow the user to set a flag to "track" notifications. If they
> > specify this flag, they can then run a "stronger" ioctl -- let's say
> > SECCOMP_IOCTL_NOTIF_STATUS, which, if the flag was specified upon
> > receiving the notification will return the current state of the
> > notification and if a signal preempted it, it will always do that.
> >
> > ---
> > Alternatively (and this is my preference), we add another filter flag,

In general this sounds good to me.

> > like SECCOMP_FILTER_FLAG_NOTIF_PREEMPT, which changes the behaviour

And make it combinable with SECCOMP_FILTER_FLAG_NEW_LISTENER, I like that.

> > to:
> > 1. Raise EPOLLPRI on preempted notifications
> > 2. All preemption notifications must be cleared via
> This seems sensible, except I don't think "preempted" is the right word.
> The state machine is pretty simple:
> live -> signaled -> killed
> (and we can go straight from live to killed, too.) So EPOLLPRI could be
> signaled if any notification changes state, and a new ioctl could read
> the list of notifications that have changed state.
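
To make the state machine above concrete, here is a rough userspace sketch
of it. All names (notif_state, notif_set_state, notif_collect_changed) are
made up for illustration and are not existing kernel or uapi symbols; the
"changed" flag only models the point at which EPOLLPRI would be signaled,
and notif_collect_changed() only models what the proposed "read the list of
changed notifications" ioctl might return.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names for illustration; none of these exist in the kernel. */
enum notif_state { NOTIF_LIVE, NOTIF_SIGNALED, NOTIF_KILLED };

struct notif {
	unsigned long long id;
	enum notif_state state;
	bool changed;	/* set on a state change, cleared once reported */
};

/* Allowed edges: live -> signaled, signaled -> killed, live -> killed. */
static bool notif_transition_ok(enum notif_state from, enum notif_state to)
{
	switch (from) {
	case NOTIF_LIVE:
		return to == NOTIF_SIGNALED || to == NOTIF_KILLED;
	case NOTIF_SIGNALED:
		return to == NOTIF_KILLED;
	default:
		return false;	/* killed is terminal */
	}
}

/*
 * Apply a transition. Returns true if the state changed, which is the
 * point where the listener fd would become readable with EPOLLPRI.
 */
static bool notif_set_state(struct notif *n, enum notif_state to)
{
	if (!notif_transition_ok(n->state, to))
		return false;
	n->state = to;
	n->changed = true;
	return true;
}

/*
 * Copy the ids of changed notifications into out (at most cap entries)
 * and clear their changed flag. Entries that do not fit stay flagged so
 * a later call can pick them up.
 */
static size_t notif_collect_changed(struct notif *v, size_t n,
				    unsigned long long *out, size_t cap)
{
	size_t found = 0;

	for (size_t i = 0; i < n && found < cap; i++) {
		if (!v[i].changed)
			continue;
		out[found++] = v[i].id;
		v[i].changed = false;
	}
	return found;
}
```

With something like this the supervisor only looks at the ids that were
reported as changed, rather than checking every outstanding notification
for validity.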

A common case will likely be to just relay the status to the
supervised task and not even perform some complicated action in the
So I think a way to optionally combine recv+send at the same time might
be a good idea. Either another ioctl which is a combined recv+send or a