Re: [RFC PATCH 2/8] jump label v4 - x86: Introduce generic jump patching without stop_machine

From: Mathieu Desnoyers
Date: Mon Jan 18 2010 - 11:52:46 EST


* H. Peter Anvin (hpa@xxxxxxxxx) wrote:
> On 01/18/2010 07:59 AM, Masami Hiramatsu wrote:
> >>>>>
> >>>>> This part bothers me. The text_poke just writes over the text
> >>>>> directly (using a separate mapping). But if that memory is in the
> >>>>> pipeline of another CPU, I think this could cause a GPF.
> >>>>>
> >>>>
> >>>> Could you clarify why you think that?
> >>>
> >>> Basically, what Steven and I were concerned about in this particular
> >>> patch version is the fact that this code took a "shortcut" for
> >>> single-byte text modification, thus bypassing the int3-bypass scheme
> >>> altogether.
> >>
> >> single byte instruction updates are likely 100x safer than any
> >> multi-byte instruction update scheme that I have seen, other than a
> >> full stop_machine().
> >>
> >> That does not mean it is safe, it just means it's an order of
> >> complexity less to analyze ;-)
> >
> > Yeah, so in the latest patch, I updated it to use int3 even if
> > len == 1. :-)
> >
>
> This really doesn't make much sense to me. The whole basis for the int3
> scheme itself is that single-byte updates are atomic, so if single-byte
> updates can't work -- and as I stated, we at Intel OTC currently believe
> it safe -- then int3 can't work either.

The additional characteristic of the int3 instruction (compared to the
general case of a single-byte instruction) is that, when executed, it
will trigger a trap, run a trap handler and return to the original code,
typically with iret. This therefore implies that a serializing
instruction is executed before returning to the instructions following
the modification site when the breakpoint is hit.

So I defer to Intel's expertise on the question of whether single-byte
instruction modification is safe in the general case. I'm just pointing
out that I can very well imagine an aggressive superscalar architecture
whose pipeline structure would support single-byte int3 patching without
any problem, thanks to the implied serialization, but would not support
general-case single-byte modification, because nothing serializes the
CPU that executes the new byte.

As we might have to port this algorithm to Itanium in the near future, I
prefer to stay on the safe side. Intel's "by the book" recommendation is
more or less that a serializing instruction must be executed on all CPUs
before new code is executed, with no distinction between single-byte and
multi-byte instructions. The int3-based bypass follows this requirement,
but the single-byte code patching does not.
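
To make the ordering concrete, here is a minimal user-space sketch of the
int3-based sequence as I describe it above. It is an illustration under
stated assumptions, not the code from the patch: poke() and
sync_all_cpus() are hypothetical stand-ins (a plain memcpy and a
counter), patch_with_int3() is a name made up for this example, and the
breakpoint handler that diverts a CPU hitting the int3 to the bypass is
left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define INT3_OPCODE 0xCC	/* x86 single-byte breakpoint instruction */

/*
 * Hypothetical stand-in for "execute a serializing instruction on every
 * CPU" (for instance an IPI whose handler returns with iret). In this
 * sketch it only counts how many times it was requested.
 */
static int sync_calls;

static void sync_all_cpus(void)
{
	sync_calls++;
}

/* Hypothetical stand-in for a text_poke-style write to code memory. */
static void poke(unsigned char *addr, const unsigned char *src, size_t len)
{
	memcpy(addr, src, len);
}

/*
 * Replace the instruction at 'site' with 'new_insn' (len bytes).
 * The single-byte case (len == 1) goes through the int3 step too,
 * rather than taking a direct-write shortcut.
 */
static void patch_with_int3(unsigned char *site,
			    const unsigned char *new_insn, size_t len)
{
	const unsigned char int3 = INT3_OPCODE;

	assert(len >= 1);

	/* 1. Put a breakpoint on the first byte, then serialize. */
	poke(site, &int3, 1);
	sync_all_cpus();

	/* 2. Rewrite the remaining bytes behind the breakpoint. */
	if (len > 1) {
		poke(site + 1, new_insn + 1, len - 1);
		sync_all_cpus();
	}

	/* 3. Replace the breakpoint with the new first byte. */
	poke(site, new_insn, 1);
	sync_all_cpus();
}
```

With a 5-byte site this requests serialization three times; with a
1-byte site, twice. The single-byte shortcut I am objecting to would be
the equivalent of calling poke() alone, with no sync_all_cpus() and no
int3 in between.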

Unless there is a measurable performance gain from special-casing the
single-byte instruction, I would recommend sticking to the safest
solution, which also follows Intel's "official" guidelines.

Thanks,

Mathieu

>
> The one thing to watch out for is that unless you force an IPI/IRET
> cycle afterwards, you can't know when any particular remote processor
> will see the update.
>
> -hpa
>
> --
> H. Peter Anvin, Intel Open Source Technology Center
> I work for Intel. I don't speak on their behalf.
>

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68