Re: [PATCH v2 1/4] tracing: Added hardware latency tracer
From: Nilay Vaish
Date: Mon Aug 22 2016 - 13:28:56 EST
On 10 August 2016 at 08:53, Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> diff --git a/kernel/trace/trace_hwlat.c b/kernel/trace/trace_hwlat.c
> new file mode 100644
> index 000000000000..08dfabe4e862
> --- /dev/null
> +++ b/kernel/trace/trace_hwlat.c
> @@ -0,0 +1,527 @@
> +/*
> + * trace_hwlatdetect.c - A simple Hardware Latency detector.
> + *
> + * Use this tracer to detect large system latencies induced by the behavior of
> + * certain underlying system hardware or firmware, independent of Linux itself.
> + * The code was developed originally to detect the presence of SMIs on Intel
> + * and AMD systems, although there is no dependency upon x86 herein.
> + *
> + * The classical example usage of this tracer is in detecting the presence of
> + * SMIs or System Management Interrupts on Intel and AMD systems. An SMI is a
> + * somewhat special form of hardware interrupt spawned from earlier CPU debug
> + * modes in which the (BIOS/EFI/etc.) firmware arranges for the South Bridge
> + * LPC (or other device) to generate a special interrupt under certain
> + * circumstances, for example, upon expiration of a special SMI timer device,
> + * due to certain external thermal readings, on certain I/O address accesses,
> + * and other situations. An SMI hits a special CPU pin, triggers a special
> + * SMI mode (complete with special memory map), and the OS is unaware.
> + *
> + * Although certain hardware-induced latencies are necessary (for example,
> + * a modern system often requires an SMI handler for correct thermal control
> + * and remote management) they can wreak havoc upon any OS-level performance
> + * guarantees toward low-latency, especially when the OS is not even made
> + * aware of the presence of these interrupts. For this reason, we need a
> + * somewhat brute force mechanism to detect these interrupts. In this case,
> + * we do it by hogging all of the CPU(s) for configurable timer intervals,
> + * sampling the built-in CPU timer, looking for discontiguous readings.
> + *
> + * WARNING: This implementation necessarily introduces latencies. Therefore,
> + * you should NEVER use this tracer while running in a production
> + * environment requiring any kind of low-latency performance
> + * guarantee(s).
> + *
> + * Copyright (C) 2008-2009 Jon Masters, Red Hat, Inc. <jcm@xxxxxxxxxx>
> + * Copyright (C) 2013-2016 Steven Rostedt, Red Hat, Inc. <srostedt@xxxxxxxxxx>
> + *
> + * Includes useful feedback from Clark Williams <clark@xxxxxxxxxx>
> + *
> + * This file is licensed under the terms of the GNU General Public
> + * License version 2. This program is licensed "as is" without any
> + * warranty of any kind, whether express or implied.
> + */
> +#include <linux/kthread.h>
> +#include <linux/tracefs.h>
> +#include <linux/uaccess.h>
> +#include <linux/delay.h>
> +#include "trace.h"
> +
> +static struct trace_array *hwlat_trace;
> +
> +#define U64STR_SIZE 22 /* 20 digits max */
> +
> +#define BANNER "hwlat_detector: "
> +#define DEFAULT_SAMPLE_WINDOW 1000000 /* 1s */
> +#define DEFAULT_SAMPLE_WIDTH 500000 /* 0.5s */
> +#define DEFAULT_LAT_THRESHOLD 10 /* 10us */
> +
> +/* sampling thread */
> +static struct task_struct *hwlat_kthread;
> +
> +static struct dentry *hwlat_sample_width; /* sample width us */
> +static struct dentry *hwlat_sample_window; /* sample window us */
> +
> +/* Save the previous tracing_thresh value */
> +static unsigned long save_tracing_thresh;
> +
> +/* If the user changed threshold, remember it */
> +static u64 last_tracing_thresh = DEFAULT_LAT_THRESHOLD * NSEC_PER_USEC;
> +
> +/* Individual latency samples are stored here when detected. */
> +struct hwlat_sample {
> + u64 seqnum; /* unique sequence */
> + u64 duration; /* delta */
> + u64 outer_duration; /* delta (outer loop) */
> + struct timespec timestamp; /* wall time */
> +};
> +
> +/* keep the global state somewhere. */
> +static struct hwlat_data {
> +
> + struct mutex lock; /* protect changes */
> +
> + u64 count; /* total since reset */
> +
> + u64 sample_window; /* total sampling window (on+off) */
> + u64 sample_width; /* active sampling portion of window */
> +
> +} hwlat_data = {
> + .sample_window = DEFAULT_SAMPLE_WINDOW,
> + .sample_width = DEFAULT_SAMPLE_WIDTH,
> +};
> +
> +static void trace_hwlat_sample(struct hwlat_sample *sample)
> +{
> + struct trace_array *tr = hwlat_trace;
> + struct trace_event_call *call = &event_hwlat;
Steven, where is the variable event_hwlat declared? It looks like a
macro declares it (most likely DEFINE_EVENT), but I could not trace
the chain of macros that ends in the declaration.
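
As an aside, here is how I read the sampling idea in the header comment
(spin for a configurable width, read the clock back to back, and look for
discontiguous readings). It is only a rough user-space sketch, not the code
from the patch: now_ns() and sample_max_gap() are names I made up, and
CLOCK_MONOTONIC stands in for whatever clock the tracer actually samples.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper (not from the patch): monotonic clock in ns. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/*
 * Busy-loop for at least width_ns, reading the clock back to back.
 * Returns the largest gap seen between two consecutive readings and
 * stores in *hits how many gaps exceeded thresh_ns.
 */
static uint64_t sample_max_gap(uint64_t width_ns, uint64_t thresh_ns,
			       unsigned int *hits)
{
	uint64_t start, last, max_gap = 0;

	assert(hits != NULL);
	*hits = 0;
	start = last = now_ns();

	do {
		uint64_t t = now_ns();
		uint64_t gap = t - last;	/* time lost between two reads */

		if (gap > max_gap)
			max_gap = gap;
		if (gap > thresh_ns)
			(*hits)++;
		last = t;
	} while (last - start < width_ns);

	return max_gap;
}
```

With the defaults quoted above (0.5s width, 10us threshold) the call would be
sample_max_gap(500000000ULL, 10000ULL, &hits). In user space, scheduler
preemption and ordinary interrupts also show up as gaps, so this only
illustrates the technique and does not isolate SMIs.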
--
Nilay