Re: [PATCH] x86/intel_rdt: use rdmsr_safe() to workaround AWS host issue

From: Vitaly Kuznetsov
Date: Thu Jan 10 2019 - 05:32:54 EST


Tony Luck <tony.luck@xxxxxxxxx> writes:

> On Wed, Jan 9, 2019 at 5:00 AM Borislav Petkov <bp@xxxxxxxxx> wrote:
>>
>> On Wed, Jan 09, 2019 at 01:09:31PM +0100, Vitaly Kuznetsov wrote:
>> > Hm, why is that? In theory, hypervisors can pass through or emulate the
>> > required MSRs...
>>
>> ...and when the theory becomes reality we'll remove the check.
>
> In practice that may be a long time coming. We don't have many CLOSIDs, or
> bits in a cache mask, at the h/w level. If you start trying to
> subdivide those resources to pass a subset to a guest, then you'll
> quickly find that you have no flexibility in the guest to do anything
> useful. It would only work if you limited to two, or perhaps three
> guests.

Running a single guest on a physical CPU is a very common scenario. In
fact, sharing cores is very rare for public clouds: e.g. all worthy
instance types on AWS/Azure give you dedicated cores, and I don't see why
a hypervisor can't pass through resctrl features.

The other thing is: how can we be sure that there's no hypervisor
exposing these features already? Even if open-source hypervisors like
KVM/Xen don't do it, that doesn't prove anything: there are numerous
proprietary hypervisors and who knows what they do.

The original issue which triggered the discussion was discovered on AWS
Xen, where the host is buggy, and I suggested a simple short-term
workaround. I'm no expert in rdt/qos, so I'm leaving it up to the
maintainers to decide which fix deserves to go in (if any).

--
Vitaly