Re: arm64 syzbot instances

From: Arnd Bergmann
Date: Sun Mar 21 2021 - 15:01:20 EST


On Sat, Mar 20, 2021 at 9:43 PM Peter Maydell <peter.maydell@xxxxxxxxxx> wrote:
>
> On Fri, 12 Mar 2021 at 09:16, Arnd Bergmann <arnd@xxxxxxxx> wrote:
> > So it's probably qemu that triggers the 'synchronous external
> > abort' when accessing the PCI I/O space, which in turn hints
> > towards a bug in qemu. Presumably it only returns data from
> > I/O ports that are actually mapped to a device when real hardware
> > is supposed to return 0xffffffff when reading from unused I/O ports.
>
> Do you have a reference to the bit of the PCI spec that mandates
> this -1/discard behaviour for attempted access to places where
> there isn't actually a PCI device mapped ? The spec is pretty
> long and hard to read...
>
> (Knowing to what extent this behaviour is mandatory for all
> PCI systems/host controllers vs just "it would be nice if the
> gpex host controller worked this way" would help in figuring
> out where in QEMU to change.)

I spent some more time looking at both really old PCI specifications
and new ones.
The old PCI specs seem to just leave this bit as out of scope because
it does not concern transactions on the bus. The PCI host controller
can either report a 'master abort' to the CPU, or ignore it, and each
bridge can decide to turn master aborts on reads into all 1s.
We do support some SoCs in Linux that trigger a CPU exception,
but we tend to deal with those with an ugly hack that just ignores
all exceptions from the CPU. Most host bridges fortunately behave
like an x86 PC though, and do not trigger an exception here.
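
To make the "x86 PC" behavior concrete, here is a rough sketch of a host
bridge that completes reads from unclaimed I/O ports with all 1s instead
of reporting the master abort to the CPU. The names (struct io_region,
host_bridge_io_read) are made up for illustration and are not taken from
qemu or Linux; it only models the lookup, not a real bus.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one device's claim on I/O port space. */
struct io_region {
    uint16_t base;          /* first port claimed by the device */
    uint16_t len;           /* number of ports claimed */
    const uint8_t *regs;    /* len bytes of register contents */
};

/* All-ones pattern truncated to the access size (1, 2 or 4 bytes). */
static uint32_t all_ones(unsigned size)
{
    return size == 4 ? 0xffffffffu : (1u << (size * 8)) - 1;
}

/*
 * Read 'size' bytes from I/O port 'port'. If no region fully covers the
 * access, nobody claims the cycle (a master abort on conventional PCI),
 * and this model turns that into all 1s rather than a CPU exception.
 */
uint32_t host_bridge_io_read(const struct io_region *map, size_t n,
                             uint16_t port, unsigned size)
{
    assert(size == 1 || size == 2 || size == 4);

    for (size_t i = 0; i < n; i++) {
        const struct io_region *r = &map[i];

        if (port >= r->base &&
            (uint32_t)port + size <= (uint32_t)r->base + r->len) {
            uint32_t val = 0;

            /* Assemble the value little-endian from the backing bytes. */
            for (unsigned b = 0; b < size; b++)
                val |= (uint32_t)r->regs[port - r->base + b] << (8 * b);
            return val;
        }
    }

    return all_ones(size);  /* unclaimed: read as 0xff, 0xffff or 0xffffffff */
}
```

With a single 4-byte device at 0x3f8, a read from 0x3f8 returns the
device's data, while a read from any other port comes back as all 1s.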

In the PCIe 4.0 specification, I found that the behavior is configurable
at the root port, using the 'RP PIO Exception Register' at offset 0x1c
in the DPC Extended Capability. This register defaults to '0', meaning
that reads from an unknown port that generate an 'Unsupported Request
Completion' get turned into all 1s. If the firmware or OS enables it,
this can instead be turned into an AER log event, an interrupt, or
a CPU exception.
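
As a rough illustration of that decision, the sketch below reads the
register at offset 0x1c from a byte image of the DPC capability and picks
the outcome for an I/O read that got an Unsupported Request completion.
The bit position used for the I/O UR case (bit 8) is from my reading of
the register layout and should be checked against the spec; the function
and enum names are invented for this example, and the other RP PIO
registers (mask, severity, SysError) are ignored here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offset of the RP PIO Exception Register within the DPC capability. */
#define DPC_RP_PIO_EXCEPTION  0x1c

/* Assumed bit for "I/O request received UR completion"; verify in the spec. */
#define RP_PIO_IO_UR_CPL      (1u << 8)

/* Hypothetical outcomes for a read that hit an Unsupported Request. */
enum ur_outcome {
    UR_RETURN_ALL_ONES,   /* default: complete the read with all 1s */
    UR_RAISE_EXCEPTION    /* escalate to a synchronous CPU exception */
};

/* Config space is little-endian; assemble a 32-bit register from bytes. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/*
 * 'dpc_cap' points at the start of a DPC extended capability image,
 * at least 0x20 bytes long. A register value of 0 (the default) means
 * the failed I/O read is simply completed with all 1s.
 */
enum ur_outcome io_ur_outcome(const uint8_t *dpc_cap)
{
    assert(dpc_cap != NULL);

    uint32_t exc = read_le32(dpc_cap + DPC_RP_PIO_EXCEPTION);

    return (exc & RP_PIO_IO_UR_CPL) ? UR_RAISE_EXCEPTION
                                    : UR_RETURN_ALL_ONES;
}
```

An emulated root port without DPC would behave like the all-zero case,
so the failed read is completed with all 1s.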

Linux has a driver for DPC, which apparently configures it to
cause an interrupt to log the event, but it does not hook up the
CPU exception handler to this. I don't see an implementation of DPC
in qemu, which I take as an indication that it should use the
default behavior and cause neither an interrupt nor a CPU exception.

Arnd