Re: [PATCH v1 2/2] perf/core: Fake regs for leaked kernel samples

From: peterz
Date: Thu Aug 06 2020 - 05:27:26 EST


On Thu, Aug 06, 2020 at 11:18:27AM +0200, peterz@xxxxxxxxxxxxx wrote:
> On Thu, Aug 06, 2020 at 10:26:29AM +0800, Jin, Yao wrote:
>
> > > +static struct pt_regs *sanitize_sample_regs(struct perf_event *event, struct pt_regs *regs)
> > > +{
> > > +	struct pt_regs *sample_regs = regs;
> > > +
> > > +	/* user only */
> > > +	if (!event->attr.exclude_kernel || !event->attr.exclude_hv ||
> > > +	    !event->attr.exclude_host || !event->attr.exclude_guest)
> > > +		return sample_regs;
> > > +
> >
> > Is this condition correct?
> >
> > Say counting user event on host, exclude_kernel = 1 and exclude_host = 0. It
> > will go "return sample_regs" path.
>
> I'm not sure, I'm terminally confused on virt stuff.

[A]

> Suppose we have nested virt:
>
>   L0-hv
>   |
>   G0/L1-hv
>      |
>      G1
>
> And we're running in G0, then:
>
> - 'exclude_hv' would exclude L0 events
> - 'exclude_host' would ... exclude L1-hv events?
> - 'exclude_guest' would ... exclude G1 events?

[B]

> Then the next question is, if G0 is a host, does the L1-hv run in
> G0 userspace or G0 kernel space?
>
> I was assuming G0 userspace would not include anything L1 (kvm is a
> kernel module after all), but what do I know.
>
> > > @@ -11609,7 +11636,8 @@ SYSCALL_DEFINE5(perf_event_open,
> > >  	if (err)
> > >  		return err;
> > > -	if (!attr.exclude_kernel) {
> > > +	if (!attr.exclude_kernel || !attr.exclude_callchain_kernel ||
> > > +	    !attr.exclude_hv || !attr.exclude_host || !attr.exclude_guest) {
> > >  		err = perf_allow_kernel(&attr);
> > >  		if (err)
> > >  			return err;
> > >
> >
> > I can understand the conditions "!attr.exclude_kernel || !attr.exclude_callchain_kernel".
> >
> > But I'm not very sure about the "!attr.exclude_hv || !attr.exclude_host || !attr.exclude_guest".
>
> Well, I'm very sure G0 userspace should never see L0 or G1 state, so
> exclude_hv and exclude_guest had better be true.
>
> > On host, exclude_hv = 1, exclude_guest = 1 and exclude_host = 0, right?
>
> Same as above, is G0 host state G0 userspace?
>
> > So even exclude_kernel = 1 but exclude_host = 0, we will still go
> > perf_allow_kernel path. Please correct me if my understanding is wrong.
>
> Yes, because with those permission checks in place it means you have
> permission to see kernel bits.

So if I understand 'exclude_host' wrong -- a distinct possibility -- can
we then pretty please have the above [A-B] corrected and put in a
comment near perf_event_attr and the exclude_* comments changed to refer
to that?
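
To make the case Jin Yao raised concrete, here is a minimal standalone sketch of the two conditions from the quoted hunks. It is not kernel code: struct excl_bits and both helper names are hypothetical stand-ins for the exclude_* bits of perf_event_attr, and it only restates the boolean logic of the patch as posted, not whatever the correct semantics of exclude_host turn out to be.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the exclude_* bits of struct perf_event_attr. */
struct excl_bits {
	unsigned int exclude_kernel : 1;
	unsigned int exclude_callchain_kernel : 1;
	unsigned int exclude_hv : 1;
	unsigned int exclude_host : 1;
	unsigned int exclude_guest : 1;
};

/*
 * Restates the "user only" test from the quoted sanitize_sample_regs()
 * hunk: the event counts as user-only when all four bits are set.
 * The hunk returns the unmodified regs in every other case.
 */
static inline bool is_user_only(const struct excl_bits *a)
{
	return a->exclude_kernel && a->exclude_hv &&
	       a->exclude_host && a->exclude_guest;
}

/*
 * Restates the condition from the quoted perf_event_open() hunk:
 * any cleared bit sends the caller through perf_allow_kernel().
 */
static inline bool needs_kernel_permission(const struct excl_bits *a)
{
	return !a->exclude_kernel || !a->exclude_callchain_kernel ||
	       !a->exclude_hv || !a->exclude_host || !a->exclude_guest;
}
```

With exclude_kernel = 1 and exclude_host = 0 (the "counting user event on host" case), is_user_only() is false and needs_kernel_permission() is true, which is exactly the behaviour being questioned above.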