Re: [PATCH v3 1/6] x86/alternatives: Teach text_poke_bp() to emulate instructions

From: Peter Zijlstra
Date: Tue Oct 08 2019 - 10:49:01 EST


On Tue, Oct 08, 2019 at 04:29:24PM +0200, Borislav Petkov wrote:
> On Mon, Oct 07, 2019 at 10:17:17AM +0200, Peter Zijlstra wrote:

> > @@ -63,8 +66,17 @@ static inline void int3_emulate_jmp(stru
> > regs->ip = ip;
> > }
> >
> > -#define INT3_INSN_SIZE 1
> > -#define CALL_INSN_SIZE 5
> > +#define INT3_INSN_SIZE 1
> > +#define INT3_INSN_OPCODE 0xCC
> > +
> > +#define CALL_INSN_SIZE 5
> > +#define CALL_INSN_OPCODE 0xE8
> > +
> > +#define JMP32_INSN_SIZE 5
> > +#define JMP32_INSN_OPCODE 0xE9
> > +
> > +#define JMP8_INSN_SIZE 2
> > +#define JMP8_INSN_OPCODE 0xEB
>
> You probably should switch those to have the name prefix come first and
> make them even shorter:
>
> OPCODE_CALL
> INSN_SIZE_CALL
> OPCODE_JMP32
> INSN_SIZE_JMP32
> OPCODE_JMP8
> ...
>
> This way you have the opcodes prefixed with OPCODE_ and the insn sizes
> with INSN_SIZE_. I.e., what they actually are.

I really don't like that; the important part is which instruction and
that really should come first. Also, your variant is horribly
inconsistent.

> > --- a/arch/x86/kernel/alternative.c
> > +++ b/arch/x86/kernel/alternative.c
>
> ...
>
> > @@ -1027,9 +1046,9 @@ NOKPROBE_SYMBOL(poke_int3_handler);
> > */
> > void text_poke_bp_batch(struct text_poke_loc *tp, unsigned int nr_entries)
> > {
> > - int patched_all_but_first = 0;
> > - unsigned char int3 = 0xcc;
> > + unsigned char int3 = INT3_INSN_OPCODE;
> > unsigned int i;
> > + int do_sync;
> >
> > lockdep_assert_held(&text_mutex);
> >
> > @@ -1053,16 +1072,16 @@ void text_poke_bp_batch(struct text_poke
> > /*
> > * Second step: update all but the first byte of the patched range.
> > */
> > - for (i = 0; i < nr_entries; i++) {
> > + for (do_sync = 0, i = 0; i < nr_entries; i++) {
> > if (tp[i].len - sizeof(int3) > 0) {
> > text_poke((char *)tp[i].addr + sizeof(int3),
> > - (const char *)tp[i].opcode + sizeof(int3),
> > + (const char *)tp[i].text + sizeof(int3),
> > tp[i].len - sizeof(int3));
> > - patched_all_but_first++;
> > + do_sync++;
> > }
> > }
> >
> > - if (patched_all_but_first) {
> > + if (do_sync) {
> > /*
> > * According to Intel, this core syncing is very likely
> > * not necessary and we'd be safe even without it. But
> > @@ -1075,10 +1094,17 @@ void text_poke_bp_batch(struct text_poke
> > * Third step: replace the first byte (int3) by the first byte of
> > * replacing opcode.
> > */
> > - for (i = 0; i < nr_entries; i++)
> > - text_poke(tp[i].addr, tp[i].opcode, sizeof(int3));
> > + for (do_sync = 0, i = 0; i < nr_entries; i++) {
>
> Can we have the do_sync reset outside of the loop?

Can, but why? That's more lines for no raisin ;-)

> > + if (tp[i].text[0] == INT3_INSN_OPCODE)
> > + continue;
>
> I'm guessing we preset the 0th byte to 0xcc somewhere.... I just can't
> seem to find it...

In the very first pass, we write INT3s everywhere.
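
To make the ordering concrete, here is a minimal user-space sketch of the three passes, following the step comments in the quoted patch. It is a simulation under stated assumptions, not the kernel code: struct poke_loc and poke_batch_sim() are hypothetical names, plain stores and memcpy() stand in for text_poke(), and the core syncing between passes is left out.

```c
#include <stddef.h>
#include <string.h>

#define INT3_INSN_OPCODE 0xCC

/* Hypothetical stand-in for struct text_poke_loc; not the kernel layout. */
struct poke_loc {
	unsigned char *addr;	/* start of the instruction being patched */
	size_t len;		/* length of the new instruction, 1..5 */
	unsigned char text[5];	/* new instruction bytes */
};

/* Simulated batch poke; the real code syncs cores between the passes. */
static void poke_batch_sim(struct poke_loc *tp, unsigned int nr_entries)
{
	unsigned int i;

	/* First step: put an INT3 on the first byte of every entry. */
	for (i = 0; i < nr_entries; i++)
		tp[i].addr[0] = INT3_INSN_OPCODE;

	/* Second step: update all but the first byte of each entry. */
	for (i = 0; i < nr_entries; i++) {
		if (tp[i].len > 1)
			memcpy(tp[i].addr + 1, tp[i].text + 1, tp[i].len - 1);
	}

	/*
	 * Third step: replace the INT3 with the first byte of the new
	 * instruction. If that byte is itself INT3, the first step
	 * already wrote it, so there is nothing left to do.
	 */
	for (i = 0; i < nr_entries; i++) {
		if (tp[i].text[0] == INT3_INSN_OPCODE)
			continue;
		tp[i].addr[0] = tp[i].text[0];
	}
}
```

With a 5-byte entry whose text[] starts with 0xE9, the buffer ends up holding the full new instruction. With a 1-byte entry whose text[0] is 0xCC, the third pass skips it and the INT3 from the first pass stays in place.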