Re: objtool clac/stac handling change..

From: Linus Torvalds
Date: Fri Jul 03 2020 - 21:54:59 EST


On Fri, Jul 3, 2020 at 5:50 PM Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
>
> How could prefetcht0 possibly
> raise an exception? Intel manual says that the only exception is #UD if
> LOCK PREFETCHT0 is encountered; not here, obviously. AMD manual simply
> says "no exceptions". Confused...

There have been several CPU bugs in this area. I think they may all have been AMD.

But we've definitely had "prefetch causes page faults" errata.

Google for it. One pdf (AMD errata) I found had this:

"Software Prefetches May Report A Page Fault

Description: Software prefetch instructions are defined to ignore
page faults. Under highly specific and detailed internal
circumstances, a prefetch instruction may report a page fault if both
of the following conditions are true:

- The target address of the prefetch would cause a page fault if
the address was accessed by an actual memory load or store instruction
under the current privilege mode;

- The prefetch instruction is followed in execution-order by an
actual or speculative byte-sized memory access of the same
modify-intent to the same address. PREFETCH and PREFETCHNTA/0/1/2 have
the same modify-intent as a memory load access.

PREFETCHW has the same modify-intent as a memory store access. The
page fault exception error code bits for the faulting prefetch will be
identical to that for a bytesized memory access of the same-modify
intent to the same address. Note that some misaligned accesses can be
broken up by the processor into multiple accesses where at least one
of the accesses is a byte-sized access. If the target address of the
subsequent memory access of the same modify-intent is aligned and not
byte-sized, this errata does not occur and no workaround is needed.

Potential Effect on System: An unexpected page fault may occur
infrequently on a prefetch instruction."
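
For illustration only, the access pattern the erratum describes (a
software prefetch followed by a byte-sized load of the same address)
looks roughly like the sketch below. The helper name and the 64-byte
prefetch distance are made up for this example and are not kernel code;
__builtin_prefetch is the GCC/Clang builtin, which on x86 is typically
emitted as prefetcht0 for locality 3. Every address touched here is
valid, so nothing faults; the erratum would additionally need a target
address that a real load would fault on.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not from the kernel): sum a byte buffer while
 * prefetching a fixed distance ahead. Each prefetched address is later
 * read with a byte-sized load, i.e. the same modify-intent (load) to
 * the same address, which is the second condition in the erratum.
 */
static unsigned long sum_bytes_prefetch(const unsigned char *buf, size_t len)
{
    unsigned long sum = 0;

    for (size_t i = 0; i < len; i++) {
        /* Assumed prefetch distance of 64 bytes; stay inside the buffer. */
        if (i + 64 < len)
            __builtin_prefetch(&buf[i + 64], 0, 3); /* rw=0: read, locality=3 */

        sum += buf[i]; /* byte-sized load of a previously prefetched address */
    }
    return sum;
}
```

Dropping the __builtin_prefetch line leaves the result unchanged, since
a prefetch is only a hint.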

So sadly the architecture manuals do not reflect reality.

That said, software prefetch instructions very seldom actually work.
They are only useful if you have one _very_ specific load and run on
one _very_ specific microarchitecture.

It's almost always a mistake to have them in the first place.

Linus