On Sat, Aug 31, 2024 at 07:43:16AM +0000, Luo Gengkun wrote:
Perf events has the notion of sampling frequency which is implemented in
software by dynamically adjusting the counter period so that samples occur
at approximately the target frequency. Period adjustment is done in 2
places:
- when the counter overflows (and a sample is recorded)
- each timer tick, when the event is active
The latter case is slightly flawed because it assumes that the time since
the last timer-tick period adjustment is 1 tick, whereas the event may not
have been active (e.g. for a task that is sleeping).
Fix by using jiffies to determine the elapsed time in that case.
Signed-off-by: Luo Gengkun <luogengkun@xxxxxxxxxxxxxxx>
Reviewed-by: Adrian Hunter <adrian.hunter@xxxxxxxxx>
---
include/linux/perf_event.h | 1 +
kernel/events/core.c | 12 +++++++++---
2 files changed, 10 insertions(+), 3 deletions(-)
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 1a8942277dda..d29b7cf971a1 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -265,6 +265,7 @@ struct hw_perf_event {
* State for freq target events, see __perf_event_overflow() and
* perf_adjust_freq_unthr_context().
*/
+ u64 freq_tick_stamp;
u64 freq_time_stamp;
u64 freq_count_stamp;
#endif
diff --git a/kernel/events/core.c b/kernel/events/core.c
index a9395bbfd4aa..183291e0d070 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -55,6 +55,7 @@
#include <linux/pgtable.h>
#include <linux/buildid.h>
#include <linux/task_work.h>
+#include <linux/jiffies.h>
#include "internal.h"
@@ -4120,9 +4121,11 @@ static void perf_adjust_freq_unthr_events(struct list_head *event_list)
{
struct perf_event *event;
struct hw_perf_event *hwc;
- u64 now, period = TICK_NSEC;
+ u64 now, period, tick_stamp;
s64 delta;
+ tick_stamp = jiffies64_to_nsecs(get_jiffies_64());
+
list_for_each_entry(event, event_list, active_list) {
if (event->state != PERF_EVENT_STATE_ACTIVE)
continue;
@@ -4148,6 +4151,9 @@ static void perf_adjust_freq_unthr_events(struct list_head *event_list)
*/
event->pmu->stop(event, PERF_EF_UPDATE);
+ period = tick_stamp - hwc->freq_tick_stamp;
+ hwc->freq_tick_stamp = tick_stamp;
+
now = local64_read(&event->count);
delta = now - hwc->freq_count_stamp;
hwc->freq_count_stamp = now;
@@ -4157,9 +4163,9 @@ static void perf_adjust_freq_unthr_events(struct list_head *event_list)
* reload only if value has changed
* we have stopped the event so tell that
* to perf_adjust_period() to avoid stopping it
- * twice.
+ * twice. And skip if it is the first tick adjust period.
*/
- if (delta > 0)
+ if (delta > 0 && likely(period != tick_stamp))
perf_adjust_period(event, period, delta, false);
event->pmu->start(event, delta > 0 ? PERF_EF_RELOAD : 0);
This one I'm less happy with.. that condition 'period != tick_stamp'
doesn't make sense to me. That's only false if hwc->freq_tick_stamp ==
0, which it will only be once after event creation. Even though the
Changelog babbles about event scheduling.
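To spell that out (this is just a restatement of the assignments already
in the hunk above, nothing new):

	period = tick_stamp - hwc->freq_tick_stamp;
	/* hence: period != tick_stamp  <=>  hwc->freq_tick_stamp != 0 */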
Also, that all should then be written something like:
	if (delta > 0 && ...) {
		perf_adjust_period(...);
		adjusted = true;
	}
	event->pmu->start(event, adjusted ? PERF_EF_RELOAD : 0);
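For completeness, a fuller sketch of that shape, assuming the intent
really is just to skip the very first tick after event creation (so the
old stamp is tested before it is overwritten; 'first_tick' and 'adjusted'
are hypothetical locals, not anything in the patch):

	bool first_tick = !hwc->freq_tick_stamp;
	bool adjusted = false;

	period = tick_stamp - hwc->freq_tick_stamp;
	hwc->freq_tick_stamp = tick_stamp;

	now = local64_read(&event->count);
	delta = now - hwc->freq_count_stamp;
	hwc->freq_count_stamp = now;

	/* reload only if the count moved and this is not the first tick */
	if (delta > 0 && !first_tick) {
		perf_adjust_period(event, period, delta, false);
		adjusted = true;
	}
	event->pmu->start(event, adjusted ? PERF_EF_RELOAD : 0);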