Re: Checking to see if a bit is _not_ set in a ftrace event filter

From: Alexei Starovoitov
Date: Tue Dec 02 2014 - 00:58:38 EST


On Mon, Dec 1, 2014 at 9:04 PM, Theodore Ts'o <tytso@xxxxxxx> wrote:
> On Mon, Dec 01, 2014 at 07:52:11PM -0800, Alexei Starovoitov wrote:
>
>> It will change the workflow for folks who use 'echo expr > filter'
>> directly. trace-cmd -e -f can be made to work transparently
>> with new features.
>
> This will break a bunch of **really** useful scripts found at:

I didn't mean that new features will break old. All existing filters
will stay as-is.

>> One of the goals for eBPF+tracing is to minimize
>> additions of new tracepoints. Right now we already
>> have a ton of them. events/ext4.h is ~2500 lines.
>> Some of them look like hooks for in-production
>> debugging of a function at a time. Sort of like a poor man's
>> kprobe/kretprobe.
>
> Well, except that kprobe and kretprobe can't give me the arguments
> passed into the function (unless you compile with full -g debugging
> info enabled and bloat the object files and compilation time by a
> factor of 10 --- which I can't stand and why I use ftrace instead of
> systemtap :-)

Well, DWARF is only needed if you have more than six function
arguments on x64, since the x64 ABI promotes scalars
to 64-bit and passes the first six integer or pointer args in
registers, so it's trivial to know where arguments are even
without debug info.
The situation is similar on most 64-bit RISC archs.
DWARF is needed when one wants to see the value of a
local C variable somewhere in the middle of a function,
but that's not common and rarely works in practice, since
var-tracking is not easy for compilers.
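
To make the register point concrete, here is a minimal sketch in plain
user-space C. It assumes the System V x86-64 calling convention;
'struct regs_sketch' and 'func_arg()' are hypothetical names made up for
this illustration (the kernel's real register snapshot is struct pt_regs).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the register snapshot a probe sees at
 * function entry. Only the six argument registers are modeled. */
struct regs_sketch {
    uint64_t di, si, dx, cx, r8, r9;
};

/* Return the n-th (0-based) integer or pointer argument, assuming the
 * System V x86-64 order: rdi, rsi, rdx, rcx, r8, r9. */
static uint64_t func_arg(const struct regs_sketch *regs, unsigned int n)
{
    assert(n < 6); /* later arguments live on the stack, not in registers */

    switch (n) {
    case 0:  return regs->di;
    case 1:  return regs->si;
    case 2:  return regs->dx;
    case 3:  return regs->cx;
    case 4:  return regs->r8;
    default: return regs->r9;
    }
}
```

No debug info is consulted anywhere: the argument index alone selects
the register.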

Another reason people say that DWARF is a 'must have'
is to access struct fields. The 'inode->i_state' dereference
would require stap to use DWARF to know the offset of 'i_state'.
For eBPF we don't need debug info in such a case,
since the C compiler does it for us.
We can just do 'offsetof(typeof(*inode), i_state)'
as part of the eBPF C program and llvm will figure out
the correct field offset when the eBPF program is compiled.
(For this to work, one needs to have matching kernel
headers.)
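
A rough user-space sketch of the offsetof idea follows. The struct is a
simplified, hypothetical stand-in for the kernel's struct inode (only
the 'i_state' name is taken from the real one), 'read_i_state()' is an
invented helper, and '__typeof__' is the GNU spelling of typeof that gcc
and clang accept in every language mode.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-in for struct inode.
 * With real kernel headers the compiler would see the real layout. */
struct inode_sketch {
    uint16_t      i_mode;
    uint32_t      i_flags;
    unsigned long i_ino;
    unsigned long i_state;
};

/* Fetch i_state through a byte offset that the compiler computes at
 * build time from the struct definition, with no debug info involved. */
static unsigned long read_i_state(const struct inode_sketch *inode)
{
    const char *base = (const char *)inode;
    unsigned long val;

    memcpy(&val, base + offsetof(__typeof__(*inode), i_state), sizeof(val));
    return val;
}
```

The offset is a compile-time constant, which is why the headers used to
build the program have to match the running kernel.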

> If eBPF can solve the ability to be able to get at the critical
> function variables ...

If the 'critical' function variables are arguments or variables
accessible via pointer walking (like inode->i_sb->s_fs_info),
then eBPF with kernel headers will do the job.
(For arguments only, no headers are needed.)
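
Pointer walking is the same trick applied once per hop. The sketch
below is plain user-space C under stated assumptions: the two structs
are hypothetical cut-down stand-ins for struct inode and struct
super_block, and 'fetch_ptr()' is an invented helper that uses memcpy
where a real tracing program would need a fault-safe read.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical cut-down stand-ins; only the field names i_sb and
 * s_fs_info are borrowed from the kernel structures. */
struct sb_walk {
    unsigned long s_flags;
    void         *s_fs_info;
};

struct inode_walk {
    unsigned long   i_ino;
    struct sb_walk *i_sb;
};

/* Invented helper: load one pointer stored at base + off. */
static void *fetch_ptr(const void *base, size_t off)
{
    void *p;

    memcpy(&p, (const char *)base + off, sizeof(p));
    return p;
}

/* Walk inode->i_sb->s_fs_info using compile-time offsets only. */
static void *inode_fs_info(const void *inode_addr)
{
    void *sb = fetch_ptr(inode_addr, offsetof(struct inode_walk, i_sb));

    return fetch_ptr(sb, offsetof(struct sb_walk, s_fs_info));
}
```

Each hop needs one struct definition from the headers and one pointer
load; no DWARF lookup is required.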

> But that's why I have the trace_func_enter() and trace_func_exit()
> calls; I need to be able to do various run-time debugging without
> needing to recompile the kernel and without forcing all of my
> development builds to have full debug info.

Understood. Thanks for sharing!