Re: [PATCH v2] perf/core: Drop kernel samples even though :u is specified

From: Jin, Yao
Date: Tue May 23 2017 - 08:32:47 EST


SNIP
+static bool skid_kernel_samples(struct perf_event *event, struct pt_regs *regs)
The name is a bit opaque, especially where it is used in
__perf_event_overflow().

How about we invert the polarity and call this sample_is_allowed() ?

That's OK, thanks.

+{
+ /*
+ * We may get kernel samples even though exclude_kernel
+ * is specified due to potential skid in sampling.
+ * The skid kernel samples could be dropped or just do
+ * nothing by testing the flag PERF_PMU_CAP_NO_SKID.
+ */
+ if (event->pmu->capabilities & PERF_PMU_CAP_NO_SKID)
+ return false;
Do we need this new cap?

I'd expect user_mode(regs) to be about as cheap as testing the cap, and
the common case is going to be that we we have test both.

For those PMUs without skid, when not sampling the kernel,
user_mode(regs) should always be true.

IMO, it would make more sense to just check user_mode(regs), which also
avoids any surprises with unexpected skid...
I guess the reason which Peter recommends to use a new cap is to have a way to keep original behavior.

If we don't need to keep original behavior, I think the new cap is not necessary. Could Peter provide comment? Thanks!

+
+ if (event->attr.exclude_kernel &&
+ !user_mode(regs) &&
+ (event->attr.sample_type & PERF_SAMPLE_IP)) {
+ return true;
+ }
+
+ return false;
+}
How about:

static bool sample_is_allowed(struct perf_event *event, struct pt_regs *regs)
{
/*
* Due to interrupt latency (AKA "skid"), we may enter the
* kernel before taking an overflow, even if the PMU is only
* counting user events.
*
* To avoid leaking information to userspace, we must always
* reject kernel samples when exclude_kernel is set.
*/
if (!user_mode(regs) && event->attr.exclude_kernel &&
(event->attr.sample_type & PERF_SAMPLE_IP))
return false;

return true;
}

... do we need to reject any other sample types, or do we definitely
avoid leaks by other means?
I just think only when the PERF_SAMPLE_IP is applied, we can get correct ip. So I check the PERF_SAMPLE_IP here.

+
/*
* Generic event overflow handling, sampling.
*/
@@ -7337,6 +7357,12 @@ static int __perf_event_overflow(struct perf_event *event,
ret = __perf_event_account_interrupt(event, throttle);
/*
+ * For security, drop the skid kernel samples if necessary.
+ */
+ if (skid_kernel_samples(event, regs))
+ return ret;
+
.. with the above changes, this can be:

if (!sample_is_allowed(event, regs))
return ret;

Thanks,
Mark.
OK, thanks! I will change the patch according to your comments.

Thanks
Jin Yao