Re: [PATCH 3/5] v2 seccomp_filters: Enable ftrace-based system call filtering

From: Ingo Molnar
Date: Thu May 26 2011 - 05:48:28 EST



* Ingo Molnar <mingo@xxxxxxx> wrote:

> You are missing the geniality of the tools/kvm/ thread pool! :-)
>
> It could be switched to a worker *process* model rather easily.
> Guest RAM and (a limited amount of) global resources would be
> shared via mmap(SHARED), but otherwise each worker process would
> have its own stack, its own subsystem-specific state, etc.

We get VM exit events in the vcpu threads, which after minimal
processing pass much of the work to the thread pool. Most of the
virtio work (which could be a source of vulnerability - ringbuffers
are hard) is done in the worker task context.

It would be possible to further increase isolation there by also
passing the IO/MMIO decoding to the worker thread - but i'm not sure
that's truly needed. Most of the risk is where most of the code is -
and the code is in the worker task which interprets on-disk data,
protocols, etc.
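
To make the "pass the decoding to the worker" variant concrete, here is
a rough sketch, not tools/kvm code. The vcpu side forwards the raw exit
record untouched over a pipe and a forked worker does the decoding. The
struct layout only mirrors the mmio fields of struct kvm_run; the names
mmio_exit, decode_device() and decode_in_worker(), the DEV_* ids and the
address ranges are all made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical raw exit record; fields mirror kvm_run.mmio. */
struct mmio_exit {
	uint64_t phys_addr;
	uint8_t  data[8];
	uint32_t len;
	uint8_t  is_write;
};

/* Hypothetical device ids; the address ranges below are invented. */
enum { DEV_NONE = 0, DEV_BLK = 1, DEV_NET = 2 };

/* The decoding step: map a guest-physical address to a device. */
static int decode_device(const struct mmio_exit *ex)
{
	if (ex->phys_addr >= 0xd0000000ULL && ex->phys_addr < 0xd0001000ULL)
		return DEV_BLK;
	if (ex->phys_addr >= 0xd0001000ULL && ex->phys_addr < 0xd0002000ULL)
		return DEV_NET;
	return DEV_NONE;
}

/*
 * Forward the undecoded record to a worker process and let it decode.
 * Returns the device id reported by the worker, or -1 on failure.
 */
int decode_in_worker(const struct mmio_exit *ex)
{
	int pipefd[2];
	int status = 0;
	ssize_t n;
	pid_t pid;

	assert(ex != NULL);
	if (pipe(pipefd) < 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		close(pipefd[0]);
		close(pipefd[1]);
		return -1;
	}

	if (pid == 0) {
		/* Worker: read the raw record, decode, report via exit code. */
		struct mmio_exit in;

		close(pipefd[1]);
		n = read(pipefd[0], &in, sizeof(in));
		_exit(n == (ssize_t)sizeof(in) ? decode_device(&in) : 255);
	}

	/* "vcpu" side: no decoding here, just hand the record over. */
	close(pipefd[0]);
	n = write(pipefd[1], ex, sizeof(*ex));
	close(pipefd[1]);

	if (waitpid(pid, &status, 0) != pid || n != (ssize_t)sizeof(*ex) ||
	    !WIFEXITED(status) || WEXITSTATUS(status) == 255)
		return -1;
	return WEXITSTATUS(status);
}
```

A real implementation would keep long-lived workers rather than fork
per exit; the fork here only keeps the sketch short.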

So we could not only isolate devices from each other, but we could
also protect the highly capable vcpu fd from exploits in devices -
worker threads generally do not need access to the vcpu fd IIRC.
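
Roughly, and only as a sketch under assumptions (this is not tools/kvm
code), the worker-process variant could look like the following: guest
RAM is a MAP_SHARED mapping that survives fork(), and the worker closes
its inherited copy of the vcpu fd before touching any device data. The
helper name run_isolated_worker() and the 0x5a marker byte are
hypothetical, and any open fd can stand in for the vcpu fd.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a worker that shares "guest RAM" with the
 * parent but has no access to vcpu_fd. Returns 0 if the worker could
 * write to the shared RAM and confirmed that the fd was gone, else -1.
 */
int run_isolated_worker(int vcpu_fd, size_t ram_size)
{
	unsigned char *ram;
	int status = 0;
	int ok;
	pid_t pid;

	assert(ram_size > 0);

	/* Shared anonymous mapping: stays shared across fork(). */
	ram = mmap(NULL, ram_size, PROT_READ | PROT_WRITE,
		   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (ram == MAP_FAILED)
		return -1;

	pid = fork();
	if (pid < 0) {
		munmap(ram, ram_size);
		return -1;
	}

	if (pid == 0) {
		/* Worker: drop the highly capable fd first. */
		close(vcpu_fd);
		/* fcntl() on a closed descriptor fails with EBADF. */
		int fd_gone = (fcntl(vcpu_fd, F_GETFD) == -1);

		/* Device emulation would run here; just mark guest RAM. */
		ram[0] = 0x5a;
		_exit(fd_gone ? 0 : 1);
	}

	/* Parent keeps vcpu_fd and sees the worker's write to shared RAM. */
	ok = waitpid(pid, &status, 0) == pid &&
	     WIFEXITED(status) && WEXITSTATUS(status) == 0 &&
	     ram[0] == 0x5a;

	munmap(ram, ram_size);
	return ok ? 0 : -1;
}
```

Calling it with, say, an fd from open("/dev/null", O_RDONLY) and a
4096-byte RAM size exercises the same path without needing /dev/kvm.
Everything outside the shared mapping (stack, heap, per-device state)
is private to the worker after the fork.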

Thanks,

Ingo