Martin Bligh wrote:
> It's looking to me like it might still need djprobes to implement, in
> order to get the atomic and safe switchover from the original function
> into the traced one. All rather sad, but seems to be true from all the
> CPU errata, etc. If anyone can see a way round that, I'd love to hear
> it.

But we don't need to fight the errata. Fortunately, there are solutions
that take care of them where they exist (on x86: djprobes/kprobes).
What's more interesting, though, is that the method as it is proposed
at this stage *seems* to be easily portable to other archs. And where
such binary trickery is difficult to pull off, nothing precludes
having a universally "portable" mechanism, such as switching between
the instrumented and the normal function at function entry. Even such
a conditional branch is cheap on current CPUs, which predict it well.
The picture is, nevertheless, very bright at the moment (I think).
Just have a 5-byte filler at function entry, as Hiramatsu-san
suggested, and use djprobes to divert to the instrumented function.
The cost of the unconditional jump in the filler will most likely be
unmeasurable, and benchmarks should confirm this.
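
To make the 5-byte figure concrete: on x86 a near "jmp rel32" is one
opcode byte (0xE9) followed by a 32-bit displacement counted from the
end of the instruction. The sketch below only computes those bytes in
user space. It is not how djprobes patches live code, and the name
encode_jmp_rel32 is made up for illustration. A zero displacement gives
a jump to the very next instruction, which is one plausible shape for
the filler; the exact filler encoding is an assumption here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define JMP_REL32_LEN 5   /* 1 opcode byte + 4 displacement bytes */

/*
 * Hypothetical helper: write the bytes of an x86 near "jmp rel32"
 * located at address `site` that transfers control to `target`.
 * Returns 0 on success, -1 if the target is out of rel32 range.
 */
int encode_jmp_rel32(uint8_t out[JMP_REL32_LEN], uint64_t site, uint64_t target)
{
    assert(out != NULL);

    /* The displacement is relative to the end of the jump instruction. */
    int64_t disp = (int64_t)(target - (site + JMP_REL32_LEN));
    if (disp < INT32_MIN || disp > INT32_MAX)
        return -1;

    uint32_t d = (uint32_t)(int32_t)disp;
    out[0] = 0xE9;                      /* opcode: jmp rel32 */
    out[1] = (uint8_t)(d & 0xff);       /* displacement, little-endian */
    out[2] = (uint8_t)((d >> 8) & 0xff);
    out[3] = (uint8_t)((d >> 16) & 0xff);
    out[4] = (uint8_t)((d >> 24) & 0xff);
    return 0;
}
```

Writing those five bytes over running code safely is the hard part, and
that is exactly what is being left to djprobes.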
So:
On x86: use 5-byte filler and djprobes.
On "sane" archs: use filler and override as explained earlier.
Elsewhere: use standard "if" or function pointer at function entry.
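
For the "elsewhere" case, a rough user-space sketch of the two fallbacks
might look like the following. All names (handler_plain, handler_traced,
handler_if, handler_ptr, handler_set_impl, tracing_on) are invented for
illustration, the counter stands in for real instrumentation, and the
plain stores are an assumption: a real kernel implementation would need
an atomic update and memory ordering appropriate to the arch.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*handler_fn)(int);

static unsigned long traced_calls;   /* stand-in for real instrumentation */

static int handler_plain(int x)
{
    return x + 1;
}

static int handler_traced(int x)
{
    traced_calls++;                  /* record the call, then do the real work */
    return handler_plain(x);
}

/* Variant 1: a standard "if" at function entry. */
static int tracing_on;

int handler_if(int x)
{
    if (tracing_on)
        return handler_traced(x);
    return handler_plain(x);
}

/* Variant 2: call through a function pointer that can be swapped. */
static handler_fn handler_impl = handler_plain;

int handler_ptr(int x)
{
    return handler_impl(x);
}

void handler_set_impl(handler_fn fn)
{
    assert(fn != NULL);
    handler_impl = fn;               /* NOT atomic as written; sketch only */
}
```

Either variant costs something on every call even when tracing is off,
which is why the filler approach is preferable wherever it can be done.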