Re: [PATCH 08/11] perf tool: precise mode requires exclude_guest

From: David Ahern
Date: Wed Sep 05 2012 - 11:43:53 EST


In an attempt to jump start this thread...

On 8/3/12 7:51 AM, Robert Richter wrote:
On 26.07.12 10:08:29, Peter Zijlstra wrote:
On Wed, 2012-07-25 at 23:16 -0600, David Ahern wrote:

Peter's patch (see https://lkml.org/lkml/2012/7/9/298) changes kernel
side to require the use of exclude_guest if the precise modifier is
used, returning -EOPNOTSUPP if exclude_guest is not set. This patch goes
after the user experience: Today if a user specifies -e <event>:p all
other modifiers are reset - including exclude_guest. Going forward we
need :p to imply :pH if a user has not specified a G or H modifier.

We could do nothing and handle the unsupported error and try setting the
exclude_guest option - like perf handles other new parameters. But
EOPNOTSUPP is not uniquely tied to this error -- e.g., it could be that
BTS is not supported (:pp). Also, we have no easy way to discriminate :p
from :pG or :pGH. It seems to me perf should not silently undo a user
request on the modifier, but inform the user the request is wrong. For
example, if a user requests -e cycles:pG it should not be silently turned
into :pH.
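
A minimal sketch of the modifier handling argued for above, written in
Python for brevity rather than as perf's actual parser. The Attr class
and resolve_modifiers() are hypothetical names; only the p, G and H
modifiers are modelled, and the field names mirror perf_event_attr.

```python
from dataclasses import dataclass


@dataclass
class Attr:
    """Hypothetical stand-in for the perf_event_attr fields discussed here."""
    precise_ip: int = 0
    exclude_guest: bool = False
    exclude_host: bool = False


def resolve_modifiers(mods):
    """Translate a modifier string such as 'p', 'pH' or 'ppG' into an Attr.

    Only 'p', 'G' and 'H' are interpreted; other modifiers are ignored.
    """
    attr = Attr(precise_ip=mods.count("p"))
    guest = "G" in mods
    host = "H" in mods

    if guest or host:
        # Explicit request: count only the contexts the user named.
        attr.exclude_guest = not guest
        attr.exclude_host = not host
    elif attr.precise_ip:
        # No G/H given: treat ':p' as ':pH'.
        attr.exclude_guest = True

    if attr.precise_ip and not attr.exclude_guest:
        # Do not silently rewrite ':pG' or ':pGH'; tell the user instead.
        raise ValueError("precise mode requires exclude_guest; "
                         "drop the G modifier or the p modifier")
    return attr
```

With this, resolve_modifiers("p") and resolve_modifiers("pH") give the
same result, while resolve_modifiers("pG") raises an error rather than
being turned into :pH behind the user's back.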

And then yesterday, Robert stated that none of the exclude_xxxx
modifiers can be set for the AMD if the precise modifier is used, so we
cannot blindly set exclude_guest if precise_ip is set.

So, it seems to me perf needs one action for Intel processors and another
for AMD.

No, we just need to teach the IBS code about SVM enter/exit.

I agree that this could be emulated in software by enabling/disabling
the event with a guest/host switch. And, even better, we add this for
every pmu in a generic way. E.g. northbridge counters and I guess also
Intel uncore events do not support G/H counting in hardware. The same
goes for other pmus imaginable in the future, like counters for
IOMMUs or other hardware devices.

But, as some pmus are not related to virtualization or other features
they simply do not need to support those attributes, or we want other
defaults, e.g. enable it system wide. Detecting features with syscall
error checking and then falling back to other defaults does not seem
the right approach to me, because it may require several syscalls to
check *combinations* of supported attributes, makes error logging and
detection more difficult due to noisy log messages and because there
is no strict attribute flag checking in current and older kernels.

I would rather see a pmu feature flag in the same style as
with /proc/cpuinfo, e.g.:

$ cat /sys/bus/event_source/devices/cpu/flags
exclude_host exclude_guest

We also need stricter attribute flag checking, esp. of reserved flags
and for unsupported features in some pmus (I am already working on some
patches for this). Userland then checks flags and sets up syscalls
according to the reported flags. The goal should be to avoid syscall
errors altogether. Thus, we are able to improve dmesg logging in case of
errors; currently we do not see any message if a syscall fails.
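
A rough sketch of the userland side of that proposal, in Python. It
assumes the flags file has the format shown in the example above
(whitespace-separated attribute names); that file is only a proposal in
this thread, and the helper names below are hypothetical.

```python
def parse_pmu_flags(text):
    """Split the contents of the proposed flags file into a set of names."""
    return set(text.split())


def read_pmu_flags(path):
    """Read and parse a flags file, e.g. the proposed
    /sys/bus/event_source/devices/cpu/flags (assumed path, not an existing ABI).
    """
    with open(path) as f:
        return parse_pmu_flags(f.read())


def unsupported(requested, supported):
    """Return the requested attribute flags the pmu does not advertise."""
    return sorted(set(requested) - set(supported))


flags = parse_pmu_flags("exclude_host exclude_guest\n")
print(unsupported(["exclude_guest", "exclude_user"], flags))  # ['exclude_user']
```

The tool would report or drop the unsupported flags before calling
perf_event_open, instead of probing with repeated syscalls.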

And finally, if a feature could be emulated, we could provide this
emulation of an attr flag to all pmus.

Does this make sense?

We need to require exclude_guest when using the precise attribute with
perf; otherwise all running VMs on Intel-based servers will crash. I do
not have an AMD-based server to even attempt the preferred solution. The
best I can do is to attempt to keep this thread alive until someone with
one can tackle the problem.

David