Re: [PATCH 4/4] [x86] perf: fix accidentally ack'ing a second event on intel perf counter
From: Robert Richter
Date: Thu Sep 02 2010 - 09:16:33 EST
On 02.09.10 04:13:19, Stephane Eranian wrote:
> Robert,
>
> Do you have the test program you used to test this?
> I believe the NHM hack does not solve the problem, it
> just makes it harder to appear.
For testing back-to-back NMIs I used:
  perf record -e cycles -e instructions -e cache-references \
      -e cache-misses -e branch-misses -a -- sleep 10
with load on all CPUs. But I couldn't reproduce this particular
problem, as I do not have such a system available. I think it might
also trigger with only one counter running. From what was observed in
the status bits, only one counter was involved.
>
> I suspect the real issue is that the GLOBAL_STATUS
> bitmask cannot be trusted. I'd like to verify this.
So yes, it looks like it is a CPU bug with a race when clearing the
status. I didn't check the errata list; maybe it is already known.
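For illustration only, the software side of the problem named in the
subject (ack'ing a second event by accident) can be modelled in user
space. This is a minimal sketch, assuming a simulated status word
instead of the real GLOBAL_STATUS register; every name in it is
hypothetical and it is not the kernel's handler. It cannot model a
hardware race in the status bits themselves, only the rule that the
handler should ack the bits it actually saw and then re-read.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the overflow status register (no real MSR access). */
static uint64_t fake_status;
static int late_overflow_pending;

static uint64_t read_status(void) { return fake_status; }

/* Clear only the bits in mask, leaving any other pending bits set. */
static void ack_status(uint64_t mask) { fake_status &= ~mask; }

/* Simulates counter 1 overflowing while the handler is still running. */
static void maybe_inject_late_overflow(void)
{
    if (late_overflow_pending) {
        fake_status |= 1ULL << 1;
        late_overflow_pending = 0;
    }
}

/* Returns the number of overflow events handled. */
static int handle_overflows(void)
{
    int handled = 0;
    uint64_t status = read_status();

    while (status) {
        for (int bit = 0; bit < 64; bit++)
            if (status & (1ULL << bit))
                handled++;              /* process this counter */

        maybe_inject_late_overflow();   /* a second event arrives here */
        ack_status(status);             /* ack the snapshot, not everything */
        status = read_status();         /* pick up anything that arrived late */
    }
    return handled;
}

/* Counter 0 overflows first; counter 1 overflows during handling. */
static int simulate_late_overflow(void)
{
    fake_status = 1ULL << 0;
    late_overflow_pending = 1;
    int handled = handle_overflows();
    assert(fake_status == 0);           /* nothing left pending */
    return handled;
}
```

In this model simulate_late_overflow() reports both events as handled.
If ack_status() cleared all bits instead of the snapshot, the late event
would be dropped without ever being processed.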
>
> Has the problem appeared only on Nehalem or also on
> Westmere?
I don't know.
-Robert
--
Advanced Micro Devices, Inc.
Operating System Research Center