Re: [PATCH v1] KVM: x86: PMU Whitelist
From: Eric Hankland
Date: Fri May 31 2019 - 16:03:21 EST
On Thu, May 30, 2019 at 5:57 PM Wei Wang <wei.w.wang@xxxxxxxxx> wrote:
>
> On 05/30/2019 01:11 AM, Eric Hankland wrote:
> > On Wed, May 29, 2019 at 12:49 AM Wei Wang <wei.w.wang@xxxxxxxxx> wrote:
> >> On 05/29/2019 02:14 AM, Eric Hankland wrote:
> >>> On Mon, May 27, 2019 at 6:56 PM Wei Wang <wei.w.wang@xxxxxxxxx> wrote:
> >>>> On 05/23/2019 06:23 AM, Eric Hankland wrote:
> >>>>> - Add a VCPU ioctl that can control which events the guest can monitor.
> >>>>>
> >>>>> Signed-off-by: ehankland <ehankland@xxxxxxxxxx>
> >>>>> ---
> >>>>> Some events can provide a guest with information about other guests or the
> >>>>> host (e.g. L3 cache stats); providing the capability to restrict access
> >>>>> to a "safe" set of events would limit the potential for the PMU to be used
> >>>>> in any side channel attacks. This change introduces a new vcpu ioctl that
> >>>>> sets an event whitelist. If the guest attempts to program a counter for
> >>>>> any unwhitelisted event, the kernel counter won't be created, so any
> >>>>> RDPMC/RDMSR will show 0 instances of that event.
> >>>> The general idea sounds good to me :)
> >>>>
> >>>> For the implementation, I would have the following suggestions:
> >>>>
> >>>> 1) Instead of using a whitelist, it would be better to use a blacklist to
> >>>> forbid the guest from counting any core-level information. So by default,
> >>>> kvm maintains a list of those core-level events, which are not exposed to
> >>>> the guest.
> >>>>
> >>>> The userspace ioctl removes the related events from the blacklist to
> >>>> make them usable by the guest.
> >>>>
> >>>> 2) Use a vm ioctl instead of a vcpu ioctl. The blacklisted events can be
> >>>> VM-wide (no need to make each vCPU maintain its own copy of the same
> >>>> list). Accordingly, put the pmu event blacklist into kvm->arch.
> >>>>
> >>>> 3) Return 1 when the guest tries to set the eventsel MSR to count an
> >>>> event which is on the blacklist.
> >>>>
> >>>> Best,
> >>>> Wei
> >>> Thanks for the feedback. I have a couple of concerns with a
> >>> KVM-maintained blacklist. First, I'm worried it will be difficult to
> >>> keep such a list up to date and accurate (both coming up with the
> >>> initial list, since there are so many events, and updating it whenever
> >>> new events are published or vulnerabilities are discovered).
> >> Not sure about "so many" above. I think there should be far fewer
> >> events that may need to be blacklisted.
> >>
> >> For example, event table 19-3 from SDM section 19.2 lists hundreds of
> >> events; how many of them do you think need to be blacklisted?
> >>
> >>> Second, users
> >>> may want to differentiate between whole-socket and sub-socket VMs
> >>> (some events may be fine for the whole-socket case) - keeping a single
> >>> blacklist wouldn't allow for this.
> >> Why wouldn't it?
> >> In any case where we want to unlock the blacklisted events (e.g. the
> >> whole socket is dedicated to a single VM), we can have userspace
> >> (e.g. via qemu command line options "+event1,+event2") issue the
> >> ioctl to have KVM do that.
> >>
> >> Btw, for the L3 cache stats event example, I'm not sure that it would
> >> be an issue if we have "AnyThread=0". I'll double-check with someone.
> >>
> >> Best,
> >> Wei
> > I think you're right that there are fewer events that seem like they
> > could leak info than events that seem like they won't, but the work to
> > validate that they definitely don't could be expensive; with a
> > whitelist it's easy to start with a small set and incrementally add to
> > it without having to evaluate all the events right away.
>
> Before going in the whitelist/blacklist direction, do you have an
> example of an event that couldn't be solved by setting "AnyThread=0"?
>
> If not, I think we could simply gate the guest's setting of
> "AnyThread=0".
>
> Best,
> Wei
With AnyThread=0, I'm not aware of any events that directly give info
about other VMs, but monitoring events related to shared resources
(e.g. LLC References and LLC Misses) could still indirectly give you
info about how heavily other users are using that resource.

I tried returning 1 when the guest writes the eventsel MSR for a
disallowed event - the behavior on modern guest kernels looks
reasonable (they warn once about an unchecked MSR access error), but it
looks like guests running older kernels (older than 2016) might panic
due to the resulting #GP fault (not to mention I'm not sure about the
behavior of non-Linux kernels). So I'm hesitant to return 1 - what do
you think?
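
To make the two behaviors concrete, here's a rough sketch of what the
check in the eventsel write path could look like. This is illustrative
only - event_is_allowed() and the function itself are made up, not
taken from the posted patch:

	/*
	 * Sketch of the eventsel-write path; event_is_allowed() is a
	 * hypothetical helper that checks the event against the filter.
	 */
	static int sketch_set_eventsel(struct kvm_pmc *pmc, u64 data)
	{
		if (!event_is_allowed(pmc_to_pmu(pmc), data)) {
			/*
			 * Option A (this patch): accept the write but skip
			 * reprogram_gp_counter(), so no perf event is created
			 * and RDPMC/RDMSR read back 0 for this event.
			 *
			 * Option B: "return 1;" here instead, so the WRMSR
			 * faults - modern Linux guests warn once, but
			 * pre-2016 kernels may panic on the #GP.
			 */
			pmc->eventsel = data;
			return 0;
		}

		pmc->eventsel = data;
		reprogram_gp_counter(pmc, data);
		return 0;
	}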
I also looked into moving from a vcpu ioctl to a vm ioctl - I can send
out a version of the patch with this change once we settle on the
other issues. It will involve some extra locking every time the
counters are programmed, to ensure the whitelist or blacklist isn't
freed while it's being accessed.
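
As a rough illustration of that locking (sketch only - the field name
kvm->arch.pmu_whitelist, struct kvm_pmu_whitelist, and the helper
whitelist_allows() are made up, and RCU is just one option; SRCU or a
plain mutex on the read side would also work):

	/* vm ioctl path: swap in the new list under kvm->lock. */
	static void sketch_set_whitelist(struct kvm *kvm,
					 struct kvm_pmu_whitelist *new)
	{
		struct kvm_pmu_whitelist *old;

		mutex_lock(&kvm->lock);
		old = rcu_dereference_protected(kvm->arch.pmu_whitelist,
						lockdep_is_held(&kvm->lock));
		rcu_assign_pointer(kvm->arch.pmu_whitelist, new);
		mutex_unlock(&kvm->lock);

		synchronize_rcu();
		kfree(old);
	}

	/* counter-programming path: read side, no kvm->lock needed. */
	static bool sketch_event_allowed(struct kvm *kvm, u64 eventsel)
	{
		struct kvm_pmu_whitelist *w;
		bool allowed;

		rcu_read_lock();
		w = rcu_dereference(kvm->arch.pmu_whitelist);
		/* No whitelist set means everything is allowed. */
		allowed = !w || whitelist_allows(w, eventsel);
		rcu_read_unlock();

		return allowed;
	}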
Eric