On Thu, Aug 06, 2020 at 11:18:27AM +0200, peterz@xxxxxxxxxxxxx wrote:
On Thu, Aug 06, 2020 at 10:26:29AM +0800, Jin, Yao wrote:
+static struct pt_regs *sanitize_sample_regs(struct perf_event *event, struct pt_regs *regs)
+{
+	struct pt_regs *sample_regs = regs;
+
+	/* user only */
+	if (!event->attr.exclude_kernel || !event->attr.exclude_hv ||
+	    !event->attr.exclude_host || !event->attr.exclude_guest)
+		return sample_regs;
+
Is this condition correct?
Say we are counting a user-only event on the host, with exclude_kernel = 1
and exclude_host = 0. It will take the "return sample_regs" path.
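To make the case above concrete, here is a minimal standalone sketch of the quoted condition. It is not kernel code: `struct excl_bits` and `returns_raw_regs()` are hypothetical names that only model the four exclude_* bits tested in the hunk.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the exclude_* bits of perf_event_attr. */
struct excl_bits {
	bool exclude_kernel;
	bool exclude_hv;
	bool exclude_host;
	bool exclude_guest;
};

/*
 * Models the quoted check: true when the hunk would take the early
 * "return sample_regs" path, i.e. when any of the four bits is clear.
 */
static bool returns_raw_regs(const struct excl_bits *a)
{
	return !a->exclude_kernel || !a->exclude_hv ||
	       !a->exclude_host || !a->exclude_guest;
}
```

With exclude_kernel = 1 and exclude_host = 0 the function returns true, which is the behaviour being questioned; it returns false only when all four bits are set.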
I'm not sure, I'm terminally confused on virt stuff.
[A]
Suppose we have nested virt:
L0-hv
|
G0/L1-hv
|
G1
And we're running in G0, then:
- 'exclude_hv' would exclude L0 events
- 'exclude_host' would ... exclude L1-hv events?
- 'exclude_guest' would ... exclude G1 events?
[B]
Then the next question is, if G0 is a host, does the L1-hv run in
G0 userspace or G0 kernel space?
I was assuming G0 userspace would not include anything L1 (kvm is a
kernel module after all), but what do I know.
@@ -11609,7 +11636,8 @@ SYSCALL_DEFINE5(perf_event_open,
 	if (err)
 		return err;
-	if (!attr.exclude_kernel) {
+	if (!attr.exclude_kernel || !attr.exclude_callchain_kernel ||
+	    !attr.exclude_hv || !attr.exclude_host || !attr.exclude_guest) {
 		err = perf_allow_kernel(&attr);
 		if (err)
 			return err;
I can understand the conditions "!attr.exclude_kernel || !attr.exclude_callchain_kernel".
But I'm not very sure about the "!attr.exclude_hv || !attr.exclude_host || !attr.exclude_guest".
Well, I'm very sure G0 userspace should never see L0 or G1 state, so
exclude_hv and exclude_guest had better be true.
On host, exclude_hv = 1, exclude_guest = 1 and exclude_host = 0, right?
Same as above, is G0 host state G0 userspace?
So even with exclude_kernel = 1, if exclude_host = 0 we will still take the
perf_allow_kernel path. Please correct me if my understanding is wrong.
Yes, because with those permission checks in place it means you have
permission to see kernel bits.
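The exchange above can be checked with a small standalone model of the proposed syscall-side condition. This is only a sketch: `struct open_bits` and `needs_kernel_permission()` are hypothetical names, and the real perf_allow_kernel() call is not reproduced here.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the attr bits tested in the perf_event_open hunk. */
struct open_bits {
	bool exclude_kernel;
	bool exclude_callchain_kernel;
	bool exclude_hv;
	bool exclude_host;
	bool exclude_guest;
};

/*
 * Models the proposed condition: true when the hunk would call
 * perf_allow_kernel(), i.e. when any of the five bits is clear.
 */
static bool needs_kernel_permission(const struct open_bits *a)
{
	return !a->exclude_kernel || !a->exclude_callchain_kernel ||
	       !a->exclude_hv || !a->exclude_host || !a->exclude_guest;
}
```

Under this model, an event with exclude_kernel = 1 but exclude_host = 0 still requires the kernel permission check, matching the reading in the question.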
So if I understand 'exclude_host' wrong -- a distinct possibility -- can
we then pretty please have the above [A-B] corrected and put in a
comment near perf_event_attr and the exclude_* comments changed to refer
to that?