Re: [patch 1/4] add basic task isolation prctl interface

From: Marcelo Tosatti
Date: Tue Jul 27 2021 - 09:08:46 EST


On Tue, Jul 27, 2021 at 10:06:41AM -0300, Marcelo Tosatti wrote:
> On Tue, Jul 27, 2021 at 02:38:15PM +0200, nsaenzju@xxxxxxxxxx wrote:
> > Hi Marcelo,
> >
> > On Tue, 2021-07-27 at 08:00 -0300, Marcelo Tosatti wrote:
> > > On Tue, Jul 27, 2021 at 12:48:33PM +0200, nsaenzju@xxxxxxxxxx wrote:
> > > > On Tue, 2021-07-27 at 07:38 -0300, Marcelo Tosatti wrote:
> > > > > +Isolation mode (PR_ISOL_MODE):
> > > > > +------------------------------
> > > > > +
> > > > > +- PR_ISOL_MODE_NONE (arg4): no per-task isolation (default mode).
> > > > > + PR_ISOL_EXIT sets mode to PR_ISOL_MODE_NONE.
> > > > > +
> > > > > +- PR_ISOL_MODE_NORMAL (arg4): applications can perform system calls normally,
> > > > > + and in case of interruption events, the notifications can be collected
> > > > > + by BPF programs.
> > > > > + In this mode, if system calls are performed, deferred actions initiated
> > > > > + by the system call will be executed before return to userspace.
> > > > > +
> > > > > +Other modes, which for example send signals upon interruptions events,
> > > > > +can be implemented.
> > > >
> > > > Shouldn't this be a set of flags that enable specific isolation features?
> > > > Something the likes of 'PR_ISOL_QUIESCE_ON_EXIT'. Modes seem more restrictive
> > > > and too much of a commitment. If we merge MODE_NORMAL as is, we won't be able
> > > > to tweak/extend its behaviour in the future.
> > >
> > > Hi Nicolas,
> > >
> > > Well, its assuming PR_ISOL_MODE_NORMAL means "enable all isolation
> > > features on return to userspace".
> > >
> > > Later on, if desired, can add extend interface as follows (using
> > > Christoph's idea to not perform automatic quiesce on return to
> > > userspace, but expose which parts need quiescing
> > > so userspace can do it on its own, as an example):
> > >
> > > #define PR_ISOL_QUIESCE_ON_EXIT (1<<0)
> > > #define PR_ISOL_VSYSCALL_PAGE (1<<1)
> > > ...
> > >
> > > unsigned long bitmap = PR_ISOL_VSYSCALL_PAGE;
> > >
> > > /* allow system calls */
> > > prctl(PR_ISOL_SET, PR_ISOL_MODE, PR_ISOL_MODE_NORMAL, 0, 0, 0);
> > >
> > > /*
> > > * disable quiescing on exit, enable reporting through
> > > * vsyscall page
> > > */
> > > prctl(PR_ISOL_SET, PR_ISOL_FEATURES, &bitmap, 0, 0);
> > > /*
> > > * configure vsyscall page
> > > */
> > > prctl(PR_ISOL_VSYSCALLS, params, ...);
> > >
> > > So unless i am missing something, it is possible to tweak/extend the
> > > interface. No?
> >
> > OK, sorry if I'm being thick, but what is the benefit of having a distincnt
> > PR_ISOL_MODE instead expressing everything as PR_ISOL_FEATURES.
> >
> > PR_ISOL_MODE_NONE == Empty PR_ISOL_FEATURES bitmap
> >
> > PR_ISOL_MODE_NORMAL == Bitmap of commonly used PR_ISOL_FEATURES
> > (we could introduce a define)
> >
> > PR_ISOL_MODE_NORMAL+PR_ISOL_VSYSCALLS == Custom bitmap
> >
> > Other than that, my rationale is that if you extend PR_ISOL_MODE_NORMAL's
> > behaviour as new features are merged, wouldn't you be potentially breaking
> > userspace (i.e. older applications might not like the new default)?
> >
> > --
> > Nicolás Sáenz
>
> The main reason is that PR_ISOL_MODE would allow for distinct
> modes to be implemented (matching each use case). For example:
>
> https://lwn.net/Articles/816298/
>
> "When a task has finished its initialization, it can activate isolation
> by using the PR_TASK_ISOLATION operation provided by the prctl()
> system call. This operation may fail for either permanent or temporary
> reasons. An example of a permanent error is when the task is set up
> on a CPU without isolation; in this case, entering isolation mode
> is not possible. Temporary errors are indicated by the EAGAIN error
> code; examples include a time when the delayed workqueues could not be
> stopped. In such cases, the task may retry the operation if it wants to
> enter isolation, as it may succeed the next time.
>
> In the prctl() call, the developer may also configure the signal to be
> sent to the task when it loses isolation. The additional macro to use is
> PR_TASK_ISOLATION_SET_SIG(), passing it the signal to send. The command
> then becomes similar to the one in the example code:"

But have no strong preference: fine with PR_ISOL_FEATURES as you
describe above, and if that is the consensus, can resubmit.