Re: [PATCHv11 05/19] x86/relocate_kernel: Use named labels for less confusion

From: Andrew Cooper
Date: Wed May 29 2024 - 08:33:48 EST


On 29/05/2024 12:28 pm, Borislav Petkov wrote:
> On Wed, May 29, 2024 at 02:17:29PM +0300, Kirill A. Shutemov wrote:
>>> That jmp 1f becomes redundant now as it simply jumps 1 line below.
>>>
>> Nothing changed wrt this jump. It dates back to initial kexec
>> implementation.
>>
>> See 5234f5eb04ab ("[PATCH] kexec: x86_64 kexec implementation").
>>
>> But I don't see functional need in it.
>>
>> Anyway, it is outside of the scope of the patch.
> Yap, Kirill did what Nikolay should've done - git archeology. Please
> don't forget to do that next time.
>
> And back in the day they didn't comment non-obvious things because
> commenting is for losers. :-\
>
> So that unconditional forward jump either flushes branch prediction on
> some old uarch or something else weird, uarch-special.
>
> I doubt we can remove it just like that.
>
> Lemme add Andy - he should know.

Seems I've gained a reputation...

jmp 1f dates back to ye olde 8086, which started the whole trend of the
instruction pointer just being a figment of the ISA's imagination[1].

Hardware maintains the pointer to the next byte to fetch (the prefetch
queue was up to 6 bytes), and there was a micro-op to subtract the
current length of the prefetch queue from the accumulator.

In those days, the prefetch queue was not coherent with main memory, and
jumps (being a discontinuity in the instruction stream) simply flushed
the prefetch queue.

This was necessary after modifying executable code, because otherwise
you could end up executing stale bytes from the prefetch queue and then
non-stale bytes thereafter.  (Otherwise known as the way to distinguish
the 8086 from the 8088 because the latter only had a 4 byte prefetch queue.)

Anyway.  It's how you used to spell "serialising operation" before that
term ever entered the architecture.  Linux still supports CPUs prior to
the Pentium, so still needs to care about prefetch queues in the 486.

However, this example appears to be in 64bit code and following a write
to CR4 which will be fully serialising, so it's probably copy&paste from
32bit code where it would be necessary in principle.

~Andrew

[1]
https://www.righto.com/2023/01/inside-8086-processors-instruction.html#fn:pc

In fact, anyone who hasn't should read the entire series on the 8086,
https://www.righto.com/p/index.html