Re: [RFC PATCH 00/13] Core scheduling v5

From: Joel Fernandes
Date: Wed Apr 15 2020 - 12:32:32 EST


On Tue, Apr 14, 2020 at 04:21:52PM +0200, Peter Zijlstra wrote:
> On Wed, Mar 04, 2020 at 04:59:50PM +0000, vpillai wrote:
> > TODO
> > ----
> > - Work on merging patches that are ready to be merged
> > - Decide on the API for exposing the feature to userland
> > - Experiment with adding synchronization points in VMEXIT to mitigate
> > the VM-to-host-kernel leaking
>
> VMEXIT is too late, you need to hook irq_enter(), which is what makes
> the whole thing so horrible.

We came up with a patch to do this as well. Currently testing it more and it
looks clean, will share it soon.

> > - Investigate the source of the overhead even when no tasks are tagged:
> > https://lkml.org/lkml/2019/10/29/242
>
> - explain why we're all still doing this ....
>
> Seriously, what actual problems does it solve? The patch-set still isn't
> L1TF complete and afaict it does exactly nothing for MDS.

The L1TF incompleteness is because of cross-HT attack from Guest vCPU
attacker to an interrupt/softirq executing on the other sibling correct? The
IRQ enter pausing the other sibling should fix that (which we will share in
a future series revision after adequate testing).

> Like I've written many times now, back when the world was simpler and
> all we had to worry about was L1TF, core-scheduling made some sense, but
> how does it make sense today?

For ChromeOS we're planning to tag each and every task seperately except for
trusted processes, so we are isolating untrusted tasks even from each other.

Sorry if this sounds like pushing my usecase, but we do get parallelism
advantage for the trusted tasks while still solving all security issues (for
ChromeOS). I agree that cross-HT user <-> kernel MDS is still an issue if
untrusted (tagged) tasks execute together on same core, but we are not
planning to do that on our setup at least.

> It's cute that this series sucks less than it did before, but what are
> we trading that performance for?

AIUI, the performance improves vs noht in the recent series. I am told that
is the case in recent postings of the series.

thanks,

- Joel