Re: [RFC PATCH 1/2] Marker probes in futex.c
From: Mathieu Desnoyers
Date: Tue Apr 15 2008 - 17:38:50 EST
* Ingo Molnar (mingo@xxxxxxx) wrote:
>
> * Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxx> wrote:
>
> > Now for some performance impact :
>
> absent from your figures is dyn-ftrace :-)
>
> dyn-ftrace is a young but rather nifty tool that extends on old-style
> static tracepoints rather forcefully, by adding 75,000+ tracepoints to
> the kernel here and now, with no in-source maintenance overhead at all.
> (It's a nice add-on to SystemTap because it adds space for a jprobe, but
> i digress ;-)
>
> anyway, my objections against markers would be reduced quite
> significantly, if we were able to improve the immediate values
> optimization to not actually patch the MOV's constant portion, but to
> just patch both the MOV and the branch instruction, into two NOPs.
>
> in fact, we can do even better: we should turn the immediate values
> optimization into a simple "patch out a jump to the slowpath"
> optimization. I.e. just a single instruction to patch in and out. (and
> that makes the NMI impact all that easier to handle as well)
>
> That would pretty much meet my "the typical trace point should have the
> cost of a single NOP" threshold for fastpath tracing overhead, which
> i've been repeating for 2 years now, and which would make the scheduler
> markers a lot more palatable ;-)
>
> hm?
>
I think we could do something there. Let's have a look at a few
marker+immediate values fast paths on x86_32 :

    4631:       b0 00                   mov    $0x0,%al
    4633:       84 c0                   test   %al,%al
    4635:       0f 85 c6 00 00 00       jne    4701 <try_to_wake_up+0xea>

    7059:       b0 00                   mov    $0x0,%al
    705b:       84 c0                   test   %al,%al
    705d:       75 63                   jne    70c2 <sched_exec+0xb6>

    83ac:       b0 00                   mov    $0x0,%al
    83ae:       84 c0                   test   %al,%al
    83b0:       75 29                   jne    83db <wait_task_inactive+0x69>

If we want to support NMI context and have the ability to instrument
preemptible code without too much headache, we must ensure that every
modification leaves the code in a "correct" state and that we do not
grow the size of any reachable instruction. Also, we must ensure gcc
did not put code between these instructions. Modifying non-relocatable
instructions would also be a pain, since we would have to deal with
instruction pointer relocation in the breakpoint code while the code
modification is being done.
Luckily, gcc almost never places any code between the mov, test and jne
instructions. But since we cannot be sure, we could dynamically check
for this code pattern after the mov instruction. If we find it, then we
treat it as if it were a single asm block, but if we don't find what
we expect, then we fall back to standard immediate values for that
site. I expect the heavily optimized version will be usable almost all
the time.
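
As a rough sketch of that dynamic check (the helper name and interface
are made up for illustration and are not part of the patch set), the
pattern match could look something like this, assuming the mov is
always into %al and using the opcode bytes seen in the dumps above:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not from the patch set: check whether the bytes
 * at `code` form the expected
 *     mov $imm8,%al ; test %al,%al ; jne <target>
 * sequence. Returns the total length of the sequence (6 bytes with a
 * short jne, 10 bytes with a near jne), or 0 if the pattern is not
 * found and the standard immediate value update has to be used.
 */
size_t marker_site_len(const uint8_t *code, size_t avail)
{
	if (avail < 6)
		return 0;
	if (code[0] != 0xb0)				/* mov $imm8,%al */
		return 0;
	if (code[2] != 0x84 || code[3] != 0xc0)		/* test %al,%al */
		return 0;
	if (code[4] == 0x75)				/* jne rel8 */
		return 6;
	if (avail >= 10 && code[4] == 0x0f && code[5] == 0x85)
		return 10;				/* jne rel32 */
	return 0;
}
```

A return value of 0 would simply mean "gcc did something unexpected
here, keep patching only the mov's constant".
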
This heavily optimized version could consist of replacing the mov with
a simple jump to the address following the "jne" instruction. To
activate the immediate value, we could simply put back a mov $0x1,%al.
I don't think we care _that_ much about the active tracing performance,
since we take an extra function call already in that case.
We could probably force the mov into %al to make sure we search for only
one test pattern (%al,%al). We would have to decode the jne instruction
to see how big it is so we can put the correct offset in the jmp
instruction replacing the original mov.
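
To make the offset computation concrete, here is a minimal sketch
(hypothetical function name, assuming a 2-byte "jmp rel8" overwrites
the 2-byte mov so that no instruction grows). The displacement is
relative to the end of the jmp, so it only has to cover the test and
the jne:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sketch: build the 2-byte "jmp rel8" that would replace
 * the 2-byte "mov $0x0,%al" while the marker is disabled.
 * jne_len is the decoded size of the jne: 2 (75 rel8) or
 * 6 (0f 85 rel32). Returns 0 on success, -1 on an unexpected length.
 */
int encode_skip_jmp(size_t jne_len, uint8_t out[2])
{
	if (jne_len != 2 && jne_len != 6)
		return -1;
	out[0] = 0xeb;				/* jmp rel8 */
	out[1] = (uint8_t)(2 + jne_len);	/* skip test (2 bytes) + jne */
	return 0;
}
```

For the try_to_wake_up site above this would give "eb 08" at 4631,
landing on 463b, the first instruction after the jne. Re-enabling the
marker would write "b0 01" (mov $0x1,%al) back over the same two bytes.
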
The only problem that arises is if gcc reuses the zero flag set by the
test in code following the jne instruction. In our case, though, I
don't see how gcc could ever want to reuse the zero flag set by a test
on our own mov to a register, unless we re-use the loaded value
somewhere else.
Dealing with the non-relocatable jmp instruction could be done by
checking, in the int3 immediate values notify callback, if the
instruction being modified is a jmp. If it is, we simply update the
return address without executing the bypass code.
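
A sketch of what that check might compute (purely illustrative: the
function name, its arguments and the way the notifier gets at the
instruction bytes are assumptions, not the existing callback's
interface):

```c
#include <stdint.h>

/*
 * Hypothetical sketch of the decision the int3 notifier would make.
 * insn_addr is the address of the instruction being modified and insn
 * holds its two bytes. If it is a short jmp, emulate it by computing
 * where execution should resume and return 1; otherwise return 0 so
 * the caller goes through the normal bypass code.
 */
int emulate_short_jmp(uintptr_t insn_addr, const uint8_t insn[2],
		      uintptr_t *resume_ip)
{
	if (insn[0] != 0xeb)			/* not jmp rel8 */
		return 0;
	/* rel8 is signed and relative to the end of the 2-byte jmp */
	*resume_ip = insn_addr + 2 + (uintptr_t)(intptr_t)(int8_t)insn[1];
	return 1;
}
```

When this returns 1, the notifier would store *resume_ip as the return
address instead of pointing it at the bypass.
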
What do you think of these ideas ?
Mathieu
--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68