On 9/23/20 2:33 AM, Qian Cai wrote:
On Fri, 2020-08-07 at 12:18 +0200, Cédric Le Goater wrote:
When a passthrough IO adapter is removed from a pseries machine using
hash MMU and the XIVE interrupt mode, the POWER hypervisor expects the
guest OS to clear all page table entries related to the adapter. If
some are still present, the RTAS call which isolates the PCI slot
returns error 9001 "valid outstanding translations" and the removal of
the IO adapter fails. This is because when the PHBs are scanned, Linux
automatically maps the INTx interrupts in the Linux interrupt number
space, but these mappings are never removed.
To solve this problem, we introduce a PPC platform specific
pcibios_remove_bus() routine which clears all interrupt mappings when
the bus is removed. This also clears the associated page table entries
of the ESB pages when using XIVE.
For this purpose, we record the logical interrupt numbers of the
mapped interrupts under the PHB structure and let pcibios_remove_bus()
do the cleanup.
Since some PCI adapters, like GPUs, use the "interrupt-map" property
to describe interrupt mappings other than the legacy INTx interrupts,
we cannot restrict the size of the mapping array to PCI_NUM_INTX. The
number of interrupt mappings is computed from the "interrupt-map"
property and the mapping array is allocated accordingly.
Cc: "Oliver O'Halloran" <oohall@xxxxxxxxx>
Cc: Alexey Kardashevskiy <aik@xxxxxxxxx>
Signed-off-by: Cédric Le Goater <clg@xxxxxxxx>
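
The bookkeeping described in the commit message above can be sketched in
plain userspace C. This is only an illustration of the idea, not the patch
itself: struct phb_irqs, phb_irqs_init(), phb_irqs_record(),
phb_irqs_remove_bus() and sketch_dispose_mapping() are hypothetical names,
the last one standing in for the kernel's irq_dispose_mapping(). The array
size is assumed to have been computed from the "interrupt-map" property
beforehand.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-PHB bookkeeping; not the kernel's struct pci_controller. */
struct phb_irqs {
	unsigned int *map;   /* recorded logical interrupt numbers, 0 = unused slot */
	unsigned int size;   /* number of slots, assumed derived from "interrupt-map" */
};

static unsigned int disposed_count;  /* counts simulated disposals */

/* Stand-in for irq_dispose_mapping(); only counts non-zero interrupts. */
static void sketch_dispose_mapping(unsigned int virq)
{
	if (virq)
		disposed_count++;
}

/* Allocate a zeroed mapping array with nr_entries slots. */
static int phb_irqs_init(struct phb_irqs *p, unsigned int nr_entries)
{
	assert(p != NULL);
	p->map = calloc(nr_entries, sizeof(*p->map));
	p->size = p->map ? nr_entries : 0;
	return p->map ? 0 : -1;
}

/* Record a mapped interrupt once; returns -1 if the array is full. */
static int phb_irqs_record(struct phb_irqs *p, unsigned int virq)
{
	unsigned int i;

	assert(p != NULL);
	for (i = 0; i < p->size; i++) {
		if (p->map[i] == virq)
			return 0;            /* already recorded */
		if (p->map[i] == 0) {
			p->map[i] = virq;    /* first free slot */
			return 0;
		}
	}
	return -1;
}

/* Dispose of every recorded mapping, then release the array. */
static void phb_irqs_remove_bus(struct phb_irqs *p)
{
	unsigned int i;

	assert(p != NULL);
	for (i = 0; i < p->size; i++)
		sketch_dispose_mapping(p->map[i]);
	free(p->map);
	p->map = NULL;   /* a second removal then finds nothing to free */
	p->size = 0;
}
```

In this sketch a second call to phb_irqs_remove_bus() is harmless, because
the pointer and the size are both reset after the array is freed.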
Some syscall fuzzing triggers this on POWER9 NV, and the traces point to
this patch.
.config: https://gitlab.com/cailca/linux-mm/-/blob/master/powerpc.config
OK. The patch is missing a NULL assignment after kfree() and that
might be the issue.
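
To illustrate the suspected problem, here is a minimal userspace sketch of
the free-then-NULL idiom, with free() standing in for kfree() (both accept
a NULL pointer as a no-op). The names struct phb_sketch, phb_sketch_alloc()
and phb_sketch_remove() are hypothetical and are not taken from the patch;
whether this is the actual cause of the reported trace is still an
assumption.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-PHB interrupt bookkeeping. */
struct phb_sketch {
	unsigned int *irq_map;   /* recorded logical interrupt numbers */
	unsigned int irq_count;  /* number of entries in irq_map */
};

/* Allocate a zeroed mapping array; returns 0 on success, -1 on failure. */
static int phb_sketch_alloc(struct phb_sketch *phb, unsigned int count)
{
	assert(phb != NULL);
	phb->irq_map = calloc(count, sizeof(*phb->irq_map));
	if (!phb->irq_map) {
		phb->irq_count = 0;
		return -1;
	}
	phb->irq_count = count;
	return 0;
}

/* Release the mapping array; safe to call more than once. */
static void phb_sketch_remove(struct phb_sketch *phb)
{
	assert(phb != NULL);
	free(phb->irq_map);      /* free(NULL) is a no-op */
	phb->irq_map = NULL;     /* without this, a second call frees a stale pointer */
	phb->irq_count = 0;
}
```

Without the NULL assignment, a second removal of the same bus would pass
the stale pointer to free() again, which is a double free.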
I did try PHB removal under PowerNV, so I would like to understand
how the PCI bus ended up being removed twice, and ideally reproduce it.
Any chance we could grab what the syscall fuzzer (syzkaller) did?