Re: [PATCH v7 03/11] task_isolation: support PR_TASK_ISOLATION_STRICT mode

From: Chris Metcalf
Date: Tue Sep 29 2015 - 13:57:44 EST

On 09/29/2015 01:46 PM, Andy Lutomirski wrote:
On Tue, Sep 29, 2015 at 10:35 AM, Chris Metcalf <cmetcalf@xxxxxxxxxx> wrote:
Well, the most interesting category is things that don't actually
trigger a signal (e.g. minor page fault) since those are things that
cause significant issues with task isolation processes
(kernel-induced jitter) but aren't otherwise user-visible,
much like an undiscovered syscall in a third-party library
can cause unexpected jitter.
Would it make sense to exempt the exceptions that result in signals?
After all, those are detectable even without your patches. Going
through all of the exception types:

divide_error, overflow, invalid_op, coprocessor_segment_overrun,
invalid_TSS, segment_not_present, stack_segment, alignment_check:
these all send signals anyway.

double_fault is fatal.

bounds: MPX faults can be silently fixed up, and those will need
notification. (Or user code should know not to do that, since it
requires an explicit opt in, and user code can flip it back off to get
the signals.)

general_protection: always signals except in vm86 mode.

int3: silently fixed if uprobes are in use, but I don't think
isolation cares about that. Otherwise signals.

debug: The perf hw_breakpoint can result in silent fixups, but those
require explicit opt-in from the admin. Otherwise, unless there's a
bug or a debugger, the user will get a signal. (As a practical
matter, the only interesting case is the undocumented ICEBP
instruction.)

math_error, simd_coprocessor_error: Sends a signal.

spurious_interrupt_bug: Irrelevant on any modern CPU AFAIK. We should
just WARN if this hits.

device_not_available: If you're using isolation without an FPU, you
have bigger problems.

page_fault: Needs notification.

NMI, MCE: arguably these should *not* notify or at least not fatally.

So maybe a better approach would be to explicitly notify for the
relevant entries: IRQs, non-signalling page faults, and non-signalling
MPX fixups. Other arches would have their own lists, but they're
probably also short except for emulated instructions.
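
For reference, the x86 list above can be written down as a small lookup
table. This is only a sketch of the classification as discussed in this
thread, not code from the patch series: the enum, struct, table and
function names (iso_action, iso_entry, iso_table, iso_classify) are all
hypothetical, and conditional cases are collapsed to their common
behaviour (e.g. page_fault is marked "notify" even though only the
non-signalling faults need it).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical categories for illustration; not taken from the patches. */
enum iso_action {
	ISO_SIGNALS,	/* already sends a signal, so it is detectable today */
	ISO_NOTIFY,	/* may be fixed up silently; needs explicit notification */
	ISO_NO_NOTIFY,	/* NMI/MCE: arguably should not notify, or not fatally */
	ISO_FATAL,	/* double_fault */
	ISO_IGNORE,	/* not expected to matter for isolation in practice */
	ISO_UNKNOWN	/* name not in the table */
};

struct iso_entry {
	const char *name;
	enum iso_action action;
};

/* One row per x86 exception named in the list above. */
static const struct iso_entry iso_table[] = {
	{ "divide_error",                ISO_SIGNALS },
	{ "overflow",                    ISO_SIGNALS },
	{ "invalid_op",                  ISO_SIGNALS },
	{ "coprocessor_segment_overrun", ISO_SIGNALS },
	{ "invalid_TSS",                 ISO_SIGNALS },
	{ "segment_not_present",         ISO_SIGNALS },
	{ "stack_segment",               ISO_SIGNALS },
	{ "alignment_check",             ISO_SIGNALS },
	{ "double_fault",                ISO_FATAL },
	{ "bounds",                      ISO_NOTIFY },	/* silent MPX fixups */
	{ "general_protection",          ISO_SIGNALS },	/* except vm86 mode */
	{ "int3",                        ISO_SIGNALS },	/* unless uprobes */
	{ "debug",                       ISO_SIGNALS },	/* unless hw_breakpoint */
	{ "math_error",                  ISO_SIGNALS },
	{ "simd_coprocessor_error",      ISO_SIGNALS },
	{ "spurious_interrupt_bug",      ISO_IGNORE },
	{ "device_not_available",        ISO_IGNORE },
	{ "page_fault",                  ISO_NOTIFY },	/* non-signalling only */
	{ "nmi",                         ISO_NO_NOTIFY },
	{ "mce",                         ISO_NO_NOTIFY },
};

/* Look up an exception by name; returns ISO_UNKNOWN if it is not listed. */
static enum iso_action iso_classify(const char *name)
{
	assert(name != NULL);
	for (size_t i = 0; i < sizeof(iso_table) / sizeof(iso_table[0]); i++) {
		if (strcmp(iso_table[i].name, name) == 0)
			return iso_table[i].action;
	}
	return ISO_UNKNOWN;
}
```

With this table only "bounds" and "page_fault" come out as ISO_NOTIFY,
which matches the short list in the summary paragraph (IRQs are not
exceptions and so are not in the table).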

IRQs should get notified via the task_isolation_debug boot flag;
the intent is that they should never get delivered to nohz_full
cores anyway, so we produce a console backtrace if the boot
flag is enabled. This isn't tied to having a task running with
TASK_ISOLATION enabled, since it just shouldn't ever happen.

Thanks for reviewing the possible exception sources on x86,
which I'm less familiar with than tile. Non-signalling page faults
and MPX fixups sound exactly right - and I didn't know about
MPX before your email (other than the userspace side of
the notion of bounds registers), so thanks for the pointer.
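
As an aside on the non-signalling page faults: they raise no signal, but
a process can at least count them after the fact with getrusage(2). The
sketch below is a userspace illustration only and is unrelated to the
patch series; the helper name minor_faults_touching() is made up for this
example. It maps fresh anonymous memory, touches one byte per page, and
reports how much ru_minflt grew.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/*
 * Touch `pages` freshly mapped anonymous pages and return the number of
 * minor faults the process took while doing so, or -1 on error.
 * Hypothetical helper, for illustration only.
 */
static long minor_faults_touching(size_t pages)
{
	struct rusage before, after;
	long page_size = sysconf(_SC_PAGESIZE);
	volatile unsigned char *buf;
	long faults = -1;
	size_t len;
	void *map;

	if (page_size <= 0 || pages == 0)
		return -1;

	len = pages * (size_t)page_size;
	map = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (map == MAP_FAILED)
		return -1;
	buf = map;

	if (getrusage(RUSAGE_SELF, &before) == 0) {
		/* First write to each page is what triggers the fault. */
		for (size_t i = 0; i < pages; i++)
			buf[i * (size_t)page_size] = 1;
		if (getrusage(RUSAGE_SELF, &after) == 0)
			faults = after.ru_minflt - before.ru_minflt;
	}

	munmap(map, len);
	return faults;
}
```

On Linux the result is normally greater than zero, though the exact count
depends on the kernel (e.g. whether huge pages back the mapping). This
only shows that faults happened; it gives no notification at the time of
the fault, which is the gap the discussion above is about.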

Chris Metcalf, EZChip Semiconductor
