Re: nvme may get timeout from dd when using different non-prefetch mmio outbound/ranges
From: Keith Busch
Date: Mon Oct 25 2021 - 12:22:06 EST
On Mon, Oct 25, 2021 at 10:47:39AM -0500, Bjorn Helgaas wrote:
> [+cc Tom (Cadence maintainer), NVMe folks]
>
> On Fri, Oct 22, 2021 at 10:08:20AM +0000, Li Chen wrote:
> > pciec: pcie-controller@2040000000 {
> >         compatible = "cdns,cdns-pcie-host";
> >         device_type = "pci";
> >         #address-cells = <3>;
> >         #size-cells = <2>;
> >         bus-range = <0 5>;
> >         linux,pci-domain = <0>;
> >         cdns,no-bar-match-nbits = <38>;
> >         vendor-id = <0x17cd>;
> >         device-id = <0x0100>;
> >         reg-names = "reg", "cfg";
> >         reg = <0x20 0x40000000 0x0 0x10000000>,
> >               <0x20 0x00000000 0x0 0x00001000>; /* RC only */
> >
> >         /*
> >          * type: 0x00000000 cfg space
> >          * type: 0x01000000 IO
> >          * type: 0x02000000 32bit mem space No prefetch
> >          * type: 0x03000000 64bit mem space No prefetch
> >          * type: 0x43000000 64bit mem space prefetch
> >          * The First 16MB from BUS_DEV_FUNC=0:0:0 for cfg space
> >          * <0x00000000 0x00 0x00000000 0x20 0x00000000 0x00 0x01000000>, CFG_SPACE
> >          */
> >         ranges = <0x01000000 0x00 0x00000000 0x20 0x00100000 0x00 0x00100000>,
> >                  <0x02000000 0x00 0x08000000 0x20 0x08000000 0x00 0x08000000>;
> >
> >         #interrupt-cells = <0x1>;
> >         interrupt-map-mask = <0x00 0x0 0x0 0x7>;
> >         interrupt-map = <0x0 0x0 0x0 0x1 &gic 0 229 0x4>,
> >                         <0x0 0x0 0x0 0x2 &gic 0 230 0x4>,
> >                         <0x0 0x0 0x0 0x3 &gic 0 231 0x4>,
> >                         <0x0 0x0 0x0 0x4 &gic 0 232 0x4>;
> >         phys = <&pcie_phy>;
> >         phy-names = "pcie-phy";
> >         status = "ok";
> > };
> >
> >
> > After some digging, I find that if I change the controller's ranges
> > property from
> >
> > <0x02000000 0x00 0x08000000 0x20 0x08000000 0x00 0x08000000> into
> > <0x02000000 0x00 0x00400000 0x20 0x00400000 0x00 0x08000000>,
> >
> > then dd will succeed without a timeout. IIUC, the range here
> > is only for non-prefetchable 32-bit MMIO, but dd will use DMA (maybe
> > the CPU will send commands to the nvme controller via MMIO?).
Generally speaking, the nvme driver notifies the controller of new
commands via an MMIO write to a specific nvme register (the submission
queue doorbell). The nvme controller then fetches those commands from
host memory with a DMA read.
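To make that concrete, the normal submission path looks roughly like the
sketch below. The names are made up for illustration; this is not the
actual nvme driver code:

#include <linux/io.h>
#include <linux/string.h>
#include <linux/types.h>

/* Hypothetical, simplified types for illustration only. */
struct sketch_cmd {
        u8 bytes[64];                   /* one 64-byte NVMe SQE */
};

struct sketch_queue {
        struct sketch_cmd *sq_cmds;     /* submission queue in host memory */
        u16 sq_tail;
        u16 q_depth;
        void __iomem *sq_doorbell;      /* controller doorbell register (MMIO) */
};

/*
 * The driver copies the command into host memory, then rings the
 * doorbell with a single 32-bit MMIO write; the controller fetches the
 * command itself with a DMA read from host memory.
 */
static void sketch_submit(struct sketch_queue *q, const struct sketch_cmd *cmd)
{
        memcpy(&q->sq_cmds[q->sq_tail], cmd, sizeof(*cmd));
        if (++q->sq_tail == q->q_depth)
                q->sq_tail = 0;
        writel(q->sq_tail, q->sq_doorbell);
}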
One exception is an nvme controller that supports placing its SQEs in a
CMB (Controller Memory Buffer), but those are not very common. If you
have such a controller, the driver will use MMIO to write commands
directly into controller memory instead of letting the controller DMA
them from host memory. Do you know if you have such a controller?
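For comparison, the CMB case would look roughly like this (again just a
sketch with made-up names): the queue lives behind a BAR, so the 64-byte
command itself goes out as MMIO writes and there is no DMA fetch by the
controller:

#include <linux/io.h>
#include <linux/types.h>

/*
 * Sketch only: cmb_sq_slot points into an ioremap'd region of the
 * controller's memory buffer rather than into host RAM.
 */
static void sketch_submit_cmb(void __iomem *cmb_sq_slot, const void *cmd,
                              size_t len, void __iomem *sq_doorbell,
                              u16 new_tail)
{
        memcpy_toio(cmb_sq_slot, cmd, len);     /* command bytes over MMIO */
        writel(new_tail, sq_doorbell);          /* ring the doorbell */
}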
The data transfers associated with your 'dd' command will always use DMA.
> I don't know how to interpret "ranges". Can you supply the dmesg and
> "lspci -vvs 0000:05:00.0" output both ways, e.g.,
>
> pci_bus 0000:00: root bus resource [mem 0x7f800000-0xefffffff window]
> pci_bus 0000:00: root bus resource [mem 0xfd000000-0xfe7fffff window]
> pci 0000:05:00.0: [vvvv:dddd] type 00 class 0x...
> pci 0000:05:00.0: reg 0x10: [mem 0x.....000-0x.....fff ...]
>
> > Question:
> > 1. Why can dd cause an nvme timeout? Are there more ways to debug this?
That means the nvme controller didn't provide a response to a posted
command within the driver's latency tolerance.
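Roughly, the pattern is like the sketch below. The real driver relies on
the block layer's request timeout machinery rather than a private timer,
and the tolerance defaults to 30 seconds and can be raised with the
nvme_core.io_timeout module parameter; the names here are made up:

#include <linux/jiffies.h>
#include <linux/printk.h>
#include <linux/timer.h>
#include <linux/types.h>

/* Hypothetical per-request tracking, for illustration only. */
struct sketch_req {
        struct timer_list timer;
        bool completed;
};

/* Fires if no completion was seen within the tolerance. */
static void sketch_timeout(struct timer_list *t)
{
        struct sketch_req *req = from_timer(req, t, timer);

        if (!req->completed)
                pr_warn("sketch: command timed out waiting for completion\n");
}

/* Armed when the command is submitted. */
static void sketch_arm_timeout(struct sketch_req *req, unsigned long secs)
{
        timer_setup(&req->timer, sketch_timeout, 0);
        mod_timer(&req->timer, jiffies + secs * HZ);
}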
> > 2. How can this mmio range affect the nvme timeout?
Let's see how those ranges affect what the kernel sees in the pci
topology, as Bjorn suggested.
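For reference, this is how I read your two 7-cell "ranges" entries (just
a sketch with a made-up struct to label the fields; the PCI side uses
#address-cells = 3, the parent side 2, and #size-cells = 2):

#include <linux/types.h>

/* Illustrative decoding of one "ranges" entry; field names are made up. */
struct sketch_pci_range {
        u32 flags;      /* cell 0:   space/prefetch flags          */
        u64 pci_addr;   /* cells 1-2: address on the PCI bus       */
        u64 cpu_addr;   /* cells 3-4: address as seen by the CPU   */
        u64 size;       /* cells 5-6: window size                  */
};

/*
 * Original non-prefetchable MEM window:
 *   <0x02000000 0x00 0x08000000 0x20 0x08000000 0x00 0x08000000>
 *   flags    = 0x02000000 (32-bit non-prefetchable memory)
 *   pci_addr = 0x08000000
 *   cpu_addr = 0x2008000000
 *   size     = 0x08000000 (128 MB)
 *
 * Modified window that makes dd work:
 *   <0x02000000 0x00 0x00400000 0x20 0x00400000 0x00 0x08000000>
 *   pci_addr = 0x00400000, cpu_addr = 0x2000400000, same 128 MB size
 *
 * IO window (unchanged in both cases):
 *   <0x01000000 0x00 0x00000000 0x20 0x00100000 0x00 0x00100000>
 *   flags = 0x01000000 (I/O space), pci_addr = 0x0,
 *   cpu_addr = 0x2000100000, size = 1 MB
 */

The host bridge window lines in dmesg should line up with the
cpu_addr/size pairs above, with the bus address shown in parentheses
when it differs from the CPU address, so the two dmesg/lspci captures
Bjorn asked for should tell us what actually changed.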