Re: [PATCH v2 00/21] libtracefs: Introducing tracefs_sql() to create synthetic events with an SQL line
From: Ahmed S. Darwish
Date: Wed Aug 04 2021 - 07:57:10 EST
Hi Steven,
On Tue, Aug 03, 2021, Steven Rostedt wrote:
>
> Major update since v1:
>
> It was brought to my attention that the man page did not state that the
> SQL syntax required JOIN .. ON in the statement. That is, they were not
> optional. I decided to fix that. But not by updating the man page, but by
> actually making JOIN .. ON optional. If you leave that out, the synthetic
> event will not be completely created, but it will have enough to create
> a histogram. See the bottom (HISTOGRAMS) for more info!
>
...
>
> HISTOGRAMS
>
> Simple SQL statements without the JOIN ON may also be used, which will
> create a histogram instead. When doing this, the struct tracefs_hist
> descriptor can be retrieved from the returned synthetic event descriptor via
> the tracefs_synth_get_start_hist(3).
>
Thanks a lot! Actually, I meant going even one step further ;)
I was imagining something like the following:
    $ trace-cmd sql-shell         # OR
    $ perf tracefs-sql-shell

    Welcome to tracefs SQL shell...

    > SELECT PNAME(common_pid),msr,val
      FROM write_msr
      WHERE msr=72 OR msr=2096

    .-------------------------------------------.
    | PNAME(common_pid)   | msr   | val         |
    |---------------------|-------|-------------|
    | qemu-system-x86     | 0x48  | 0           |
    | qemu-system-x86     | 0x48  | 0           |
    | qemu-system-x86     | 0x48  | 0           |
    | kworker/u16:2       | 0x830 | 0x1000008fb |
    | ....                | ....  | .....       |
    +-------------------------------------------+

    > SELECT MAX(end.TIMESTAMP_USECS - start.TIMESTAMP_USECS) AS MaxSystemLatency_us,
             PNAME(common_pid)
      FROM sched_waking AS start JOIN sched_switch AS end
      ON start.pid = end.next_pid

    .-------------------------------------------.
    | MaxSystemLatency_us | PNAME(common_pid)   |
    |---------------------|---------------------|
    | 350                 | cyclictest          |
    +-------------------------------------------+

    > SELECT (end.TIMESTAMP_USECS - start.TIMESTAMP_USECS) AS latency,
             PNAME(common_pid), PRIO(common_pid)
      FROM sched_waking AS start JOIN sched_switch AS end
      ON start.pid = end.next_pid
      ORDER BY latency DESC
      LIMIT 5

    .-----------------------------------------------------------.
    | latency | PNAME(common_pid)           | PRIO(common_pid)  |
    |---------|-----------------------------|-------------------|
    | 829     | cyclictest                  | SCHED_FIFO:98     |
    | 400     | cyclictest                  | SCHED_FIFO:98     |
    | 192     | pulseaudio-rt               | SCHED_RR:48       |
    | 30      | firefox                     | SCHED_OTHER:0:0   |
    | 10      | kworker/0:0H-events_highpri | SCHED_OTHER:0:-20 |
    +-----------------------------------------------------------+

    > SELECT (end.TIMESTAMP_USECS - start.TIMESTAMP_USECS) as MaxIRQLatency_us
      FROM irq_disable as start JOIN irq_enable as end
      ON start.common_pid = end.common_pid AND
         start.parent_offs = end.parent_offs
      ORDER BY MaxIRQLatency_us DESC
      LIMIT 1

    .------------------.
    | MaxIRQLatency_us |
    |------------------|
    | 37               |
    +------------------+

And so on....
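
To make the intended semantics of the wakeup-latency query a bit more concrete, here is a rough user-space sketch in Python. It is only an illustration and not how libtracefs or the kernel would evaluate anything: SQLite stands in for the query engine, the `ts_us` column and the `top_wakeup_latencies()` helper are hypothetical names, and pairing each waking with the first later switch to the same task is an assumption about what the JOIN is meant to express. The aliases are `w` and `sw` because END is a keyword in SQLite.

```python
import sqlite3


def top_wakeup_latencies(wakings, switches, limit=5):
    """Return (latency_us, pid) rows, largest latency first.

    wakings:  iterable of (timestamp_us, pid) from a hypothetical
              sched_waking event dump
    switches: iterable of (timestamp_us, next_pid) from a hypothetical
              sched_switch event dump
    """
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE sched_waking (ts_us INTEGER, pid INTEGER)")
    db.execute("CREATE TABLE sched_switch (ts_us INTEGER, next_pid INTEGER)")
    db.executemany("INSERT INTO sched_waking VALUES (?, ?)", wakings)
    db.executemany("INSERT INTO sched_switch VALUES (?, ?)", switches)

    # Assumption: a waking is matched with the earliest sched_switch to the
    # same pid that happens at or after it (MIN over the candidate switches).
    rows = db.execute(
        """
        SELECT MIN(sw.ts_us) - w.ts_us AS latency, w.pid
        FROM sched_waking AS w JOIN sched_switch AS sw
          ON w.pid = sw.next_pid AND sw.ts_us >= w.ts_us
        GROUP BY w.rowid
        ORDER BY latency DESC, w.pid
        LIMIT ?
        """,
        (limit,),
    ).fetchall()
    db.close()
    return rows
```

With made-up events such as `top_wakeup_latencies([(100, 1), (200, 2)], [(150, 1), (1000, 2)])`, the task that waited longest comes first, which is the behaviour the ORDER BY ... DESC LIMIT examples above are after.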
The idea was that since the community already picked SQL as a
higher-level tracing language, why tie the SQL language exclusively to
synthetic events and histograms?
The language can already offer something *way more generic*, out of the
box, while still covering the desired special cases.
We can support the standard SQL aggregate functions (e.g., MAX(), MIN(),
SUM(), COUNT(), DISTINCT(), AVG(), etc.) + some kernel-specific
functions (e.g., PROCESS_NAME(), PROCESS_PRIO(), USECS(), etc.) + the
standard SQL keywords like ORDER BY, LIMIT, DESC, ASC, etc. This would
offer some nice friendly competition to BPF tracing, while still being a
(relatively) simple *query-only* language.
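
As a rough sketch of what a "kernel-specific function" could look like from the query side, the Python snippet below registers a PNAME() function with SQLite and runs the write_msr query from the mock-up above. Everything here is an assumption made for illustration: SQLite is only a stand-in engine, `query_write_msr()` is a hypothetical helper, and the pid-to-name mapping is passed in as a plain dictionary rather than looked up from the kernel.

```python
import sqlite3


def query_write_msr(events, comms):
    """Run the write_msr example query over in-memory data.

    events: iterable of (common_pid, msr, val) tuples (hypothetical dump)
    comms:  dict mapping pid -> process name, standing in for a real lookup
    """
    db = sqlite3.connect(":memory:")
    # Register PNAME() as a one-argument SQL function; unknown pids map to
    # a "<...>" placeholder.
    db.create_function("PNAME", 1, lambda pid: comms.get(pid, "<...>"))
    db.execute(
        "CREATE TABLE write_msr (common_pid INTEGER, msr INTEGER, val INTEGER)"
    )
    db.executemany("INSERT INTO write_msr VALUES (?, ?, ?)", events)
    rows = db.execute(
        """
        SELECT PNAME(common_pid), msr, val
        FROM write_msr
        WHERE msr = 72 OR msr = 2096
        ORDER BY rowid
        """
    ).fetchall()
    db.close()
    # Format msr and val in hex for display.
    return [(name, hex(msr), hex(val)) for name, msr, val in rows]
```

The point of the sketch is only that the helper functions can stay small and separate from the query language itself: the SQL text is unchanged, and PNAME() is just one more function the engine knows about.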
I'm not sure if you would be OK with this, but I thought a proposal
wouldn't hurt :)
I can also write some patches on top of this series if you are OK with
the principle in general.
Kind regards,
--
Ahmed S. Darwish
Linutronix GmbH