Re: [PATCH] checkpatch: Check for .byte-spelled insn opcodes documentation on x86

From: Borislav Petkov
Date: Mon Oct 12 2020 - 10:22:00 EST


On Sat, Oct 10, 2020 at 09:47:59AM -0700, Joe Perches wrote:
> > '/\s*\.byte\s+(?:0x[0-9a-f]{1,2}[\s,]*){2,}/i'
> ^^^ ^
> now useless without the "

There are \.byte specifications without " so not useless.

> matches .BYTE

so what. It would have failed before, when trying to compile it.

> you probably want (?i:0x[etc...]
>
> I'd prefer to add an upper bound to the {m,n} use.
> Unbounded multiple
> matches {m,} can cause perl aborts.

Ok, we can make that 15. Max insn length on x86 is 15 bytes and that
is unrealistically high for this use case so we should be good. And the
range must be {1,15} because you can have single-byte instructions.

And that's fine if there are *some* false positives. And whatever we do,
it won't match everything. For example:

arch/x86/include/asm/fpu/internal.h:208:#define XSAVE ".byte " REX_PREFIX "0x0f,0xae,0x27"
arch/x86/include/asm/fpu/internal.h:209:#define XSAVEOPT ".byte " REX_PREFIX "0x0f,0xae,0x37"
arch/x86/include/asm/fpu/internal.h:210:#define XSAVES ".byte " REX_PREFIX "0x0f,0xc7,0x2f"
arch/x86/include/asm/fpu/internal.h:211:#define XRSTOR ".byte " REX_PREFIX "0x0f,0xae,0x2f"
arch/x86/include/asm/fpu/internal.h:212:#define XRSTORS ".byte " REX_PREFIX "0x0f,0xc7,0x1f"

but that's fine. I prefer for the regex to remain readable and single
outliers like those are caught in manual review.

As another example, sometimes it would be a false positive for another
reason:

arch/x86/include/asm/idtentry.h:500: * Note, that the 'pushq imm8' is emitted via '.byte 0x6a, vector' because

that's why I've changed the text to say "Please consider..." implying
thatdocumenting binutils version might not always be necessary/needed.

All in all, it's fine if there are some false positives and it can make
reviewers have a second look.

> This regex would also match
>
> .byte 0x020x02
>
> (which admittedly wouldn't compile, but I've seen really bad patches
> submitted too)

That's fine - I love reviewing !compiled patches. They will never send
!compiled again.

> A readability convenience would be to add and use:
>
> our $Hex_byte = qr{(?i)0x[0-9a-f]{1,2}\b};
>
> So if the minimum length if the isns .byte block is 2,
> with a separating comma then the regex could be:
>
> /\.byte\s+$Hex_byte\s*,\s*$Hex_byte\b/
>
> which I think is pretty readable.

Yap, makes sense. v3 coming up...

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette