Re: [RFC PATCH 1/2] Marker probes in futex.c

From: Mathieu Desnoyers
Date: Wed Apr 16 2008 - 10:00:26 EST


* Arjan van de Ven (arjan@xxxxxxxxxxxxx) wrote:
>
> > > 4631: b0 00 mov $0x0,%al
> > > 4633: 84 c0 test %al,%al
> > > 4635: 0f 85 c6 00 00 00 jne 4701
>
> the use of partial registers here is unfortunate and probably quite expensive ;(
>
>

Yes, but it saves instruction cache space. It's a tradeoff.

> > > If we want to support NMI context and have the ability to
> > > instrument preemptable code without too much headache, we must
> > > ensure that every modification will leave the code in a "correct"
> > > state and that we do not grow the size of any reachable
> > > instruction. Also, we must ensure gcc did not put code between
> > > these instructions. Modifying non-relocatable instructions would
> > > also be a pain, since we would have to deal with instruction
> > > pointer relocation in the breakpoint code when the code
> > > modification is being done.
>
> you also need to make sure no cpu is executing that code ever..
> but you already deal with that right?
>

By "ensure that every modification will leave the code in a "correct"
state", I mean that at any given time before, during or after the code
modification, if an NMI arrives on any CPU and tries to run the modified
code, it must find a valid version of the code to execute. Does that
make more sense?
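
To make the "valid at any given time" requirement concrete, here is a
minimal sketch, not the actual patch. It assumes the only byte that ever
changes is the one-byte immediate of the mov, so the opcode and the
instruction length stay fixed. The helper names are hypothetical, the
`__atomic` builtins are GCC/Clang specific, and the buffer is ordinary
data: write-protected kernel text and cross-CPU instruction fetch are
deliberately left out.

```c
#include <stdint.h>

/* Offset of the imm8 operand inside "b0 ii   mov $imm8,%al". */
enum { MOV_AL_IMM_OFFSET = 1 };

/*
 * Hypothetical helper: change the marker state by rewriting only the
 * one-byte immediate. A single-byte store cannot be torn, so a reader
 * sees either the old immediate or the new one, never a mix of both.
 */
static void marker_set_imm(uint8_t *mov_insn, uint8_t value)
{
	__atomic_store_n(&mov_insn[MOV_AL_IMM_OFFSET], value, __ATOMIC_RELAXED);
}

/* Hypothetical helper: read back the current immediate. */
static uint8_t marker_get_imm(const uint8_t *mov_insn)
{
	return __atomic_load_n(&mov_insn[MOV_AL_IMM_OFFSET], __ATOMIC_RELAXED);
}
```

Anything that would need to touch more than one byte, or change the
length of an instruction, falls outside this sketch.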

> > >
> > > Luckily, gcc almost never places any code between the mov, test and
> > > jne instructions. But since we cannot be sure, we could dynamically
> > > check for this code pattern after the mov instruction. If we find
> > > it, then we play with it as if it was a single asm block, but if we
> > > don't find what we expect, then we use standard immediate values
> > > for that. I expect the heavily optimised version will be usable
> > > almost all the time.
>
> I expect gcc to start using the macro-fusion capable ones more and more over time at least,
> and for that the compare and jmp need to be consecutive.
>

Early results from the work I did last night: I can detect about 96%
of the ~120 markers I've put in my instrumented kernel.

Not only do the compare and jmp need to be consecutive, but the movb
$0x0,%al must also come immediately before them. I *could* try to detect
specific code inserted in between, but I really have to make sure I
don't get burned by the compiler inserting a jmp there.
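
As a rough illustration of that check (a sketch only, not the code I
will post), the function below matches the exact byte layout of the
disassembly quoted above: mov $imm8,%al, then test %al,%al, then a near
jne with a 32-bit displacement. The helper names and the `avail`
bounds parameter are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Expected byte layout, taken from the disassembly quoted above:
 *   b0 ii              mov  $imm8,%al
 *   84 c0              test %al,%al
 *   0f 85 r0 r1 r2 r3  jne  rel32 (near form only)
 */
enum { MARKER_SEQ_LEN = 10 };

/*
 * Hypothetical helper: return 1 if the three instructions sit back to
 * back at p with nothing inserted between them, 0 otherwise.
 */
static int marker_seq_matches(const uint8_t *p, size_t avail)
{
	if (avail < MARKER_SEQ_LEN)
		return 0;
	return p[0] == 0xb0 &&
	       p[2] == 0x84 && p[3] == 0xc0 &&
	       p[4] == 0x0f && p[5] == 0x85;
}

/*
 * Hypothetical helper: offset of the jne target relative to p. rel32 is
 * stored little-endian and counted from the end of the jne instruction.
 */
static long marker_jne_target_offset(const uint8_t *p)
{
	uint32_t rel = (uint32_t)p[6] | (uint32_t)p[7] << 8 |
		       (uint32_t)p[8] << 16 | (uint32_t)p[9] << 24;

	return (long)MARKER_SEQ_LEN + (int32_t)rel;
}
```

With the bytes from the dump (b0 00 84 c0 0f 85 c6 00 00 00 at 0x4631),
the check matches and the offset is 0xd0, so 0x4631 + 0xd0 = 0x4701,
which is the jne target shown there. A short-form jne (75 rel8), or any
instruction placed between the three, fails the check, and that site
would fall back to standard immediate values.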

I'll post my work shortly.

Mathieu

>
> --
> If you want to reach me at my work email, use arjan@xxxxxxxxxxxxxxx
> For development, discussion and tips for power savings,
> visit http://www.lesswatts.org

--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68