RE: [EXT] Re: nvme may get timeout from dd when using different non-prefetch mmio outbound/ranges

From: Li Chen
Date: Tue Oct 26 2021 - 00:34:34 EST


Hi, Bjorn

> -----Original Message-----
> From: Bjorn Helgaas [mailto:helgaas@xxxxxxxxxx]
> Sent: Monday, October 25, 2021 11:48 PM
> To: Li Chen
> Cc: linux-pci@xxxxxxxxxxxxxxx; Lorenzo Pieralisi; Rob Herring; kw@xxxxxxxxx;
> Bjorn Helgaas; linux-kernel@xxxxxxxxxxxxxxx; Tom Joseph; Keith Busch; Jens
> Axboe; Christoph Hellwig; Sagi Grimberg; linux-nvme@xxxxxxxxxxxxxxxxxxx
> Subject: [EXT] Re: nvme may get timeout from dd when using different non-
> prefetch mmio outbound/ranges
>
> [+cc Tom (Cadence maintainer), NVMe folks]
>
> On Fri, Oct 22, 2021 at 10:08:20AM +0000, Li Chen wrote:
> > Hi, all
> >
> > I found my nvme may get timeout with simply dd, the pcie controller
> > is cadence, and the pcie-controller platform I use is
> > pcie-cadence-host.c, kenrel version is 5.10.32:
>
> Given that nvme is pretty widely used, I tend to suspect the cadence
> driver or DTS here, but I added both cadence and nvme folks to the CC:
> list.

Thanks!

>
> > # dd if=/dev/urandom of=/dev/nvme0n1 bs=4k count=1024000
> > [ 41.913484][ T274] urandom_read: 2 callbacks suppressed
> > [ 41.913490][ T274] random: dd: uninitialized urandom read (4096 bytes read)
> > [ 41.926130][ T274] random: dd: uninitialized urandom read (4096 bytes read)
> > [ 41.933348][ T274] random: dd: uninitialized urandom read (4096 bytes read)
> > [ 47.651842][ C0] random: crng init done
> > [ 47.655963][ C0] random: 2 urandom warning(s) missed due to ratelimiting
> > [ 81.448128][ T64] nvme nvme0: controller is down; will reset: CSTS=0x3,
> PCI_STATUS=0x2010
> > [ 81.481139][ T7] nvme 0000:05:00.0: enabling bus mastering
> > [ 81.486946][ T7] nvme 0000:05:00.0: saving config space at offset 0x0
> (reading 0xa809144d)
> > [ 81.495517][ T7] nvme 0000:05:00.0: saving config space at offset 0x4
> (reading 0x20100006)
> > [ 81.504091][ T7] nvme 0000:05:00.0: saving config space at offset 0x8
> (reading 0x1080200)
> > [ 81.512571][ T7] nvme 0000:05:00.0: saving config space at offset 0xc
> (reading 0x0)
> > [ 81.520527][ T7] nvme 0000:05:00.0: saving config space at offset 0x10
> (reading 0x8000004)
> > [ 81.529094][ T7] nvme 0000:05:00.0: saving config space at offset 0x14
> (reading 0x0)
> > [ 81.537138][ T7] nvme 0000:05:00.0: saving config space at offset 0x18
> (reading 0x0)
> > [ 81.545186][ T7] nvme 0000:05:00.0: saving config space at offset 0x1c
> (reading 0x0)
> > [ 81.553252][ T7] nvme 0000:05:00.0: saving config space at offset 0x20
> (reading 0x0)
> > [ 81.561296][ T7] nvme 0000:05:00.0: saving config space at offset 0x24
> (reading 0x0)
> > [ 81.569340][ T7] nvme 0000:05:00.0: saving config space at offset 0x28
> (reading 0x0)
> > [ 81.577384][ T7] nvme 0000:05:00.0: saving config space at offset 0x2c
> (reading 0xa801144d)
> > [ 81.586038][ T7] nvme 0000:05:00.0: saving config space at offset 0x30
> (reading 0x0)
> > [ 81.594081][ T7] nvme 0000:05:00.0: saving config space at offset 0x34
> (reading 0x40)
> > [ 81.602217][ T7] nvme 0000:05:00.0: saving config space at offset 0x38
> (reading 0x0)
> > [ 81.610266][ T7] nvme 0000:05:00.0: saving config space at offset 0x3c
> (reading 0x12c)
> > [ 81.634065][ T7] nvme nvme0: Shutdown timeout set to 8 seconds
> > [ 81.674332][ T7] nvme nvme0: 1/0/0 default/read/poll queues
> > [ 112.168136][ T256] nvme nvme0: I/O 12 QID 1 timeout, disable controller
> > [ 112.193145][ T256] blk_update_request: I/O error, dev nvme0n1, sector
> 600656 op 0x1:(WRITE) flags 0x104000 phys_seg 127 prio class 0
> > [ 112.205220][ T256] Buffer I/O error on dev nvme0n1, logical block 75082, lost
> async page write
> > [ 112.213978][ T256] Buffer I/O error on dev nvme0n1, logical block 75083, lost
> async page write
> > [ 112.222727][ T256] Buffer I/O error on dev nvme0n1, logical block 75084, lost
> async page write
> > [ 112.231474][ T256] Buffer I/O error on dev nvme0n1, logical block 75085, lost
> async page write
> > [ 112.240220][ T256] Buffer I/O error on dev nvme0n1, logical block 75086, lost
> async page write
> > [ 112.248966][ T256] Buffer I/O error on dev nvme0n1, logical block 75087, lost
> async page write
> > [ 112.257719][ T256] Buffer I/O error on dev nvme0n1, logical block 75088, lost
> async page write
> > [ 112.266467][ T256] Buffer I/O error on dev nvme0n1, logical block 75089, lost
> async page write
> > [ 112.275213][ T256] Buffer I/O error on dev nvme0n1, logical block 75090, lost
> async page write
> > [ 112.283959][ T256] Buffer I/O error on dev nvme0n1, logical block 75091, lost
> async page write
> > [ 112.293554][ T256] blk_update_request: I/O error, dev nvme0n1, sector
> 601672 op 0x1:(WRITE) flags 0x104000 phys_seg 127 prio class 0
> > [ 112.306559][ T256] blk_update_request: I/O error, dev nvme0n1, sector
> 602688 op 0x1:(WRITE) flags 0x104000 phys_seg 127 prio class 0
> > [ 112.319525][ T256] blk_update_request: I/O error, dev nvme0n1, sector
> 603704 op 0x1:(WRITE) flags 0x104000 phys_seg 127 prio class 0
> > [ 112.332501][ T256] blk_update_request: I/O error, dev nvme0n1, sector
> 604720 op 0x1:(WRITE) flags 0x104000 phys_seg 127 prio class 0
> > [ 112.345466][ T256] blk_update_request: I/O error, dev nvme0n1, sector
> 605736 op 0x1:(WRITE) flags 0x104000 phys_seg 127 prio class 0
> > [ 112.358427][ T256] blk_update_request: I/O error, dev nvme0n1, sector
> 606752 op 0x1:(WRITE) flags 0x104000 phys_seg 127 prio class 0
> > [ 112.371386][ T256] blk_update_request: I/O error, dev nvme0n1, sector
> 607768 op 0x1:(WRITE) flags 0x104000 phys_seg 127 prio class 0
> > [ 112.384346][ T256] blk_update_request: I/O error, dev nvme0n1, sector
> 608784 op 0x1:(WRITE) flags 0x104000 phys_seg 127 prio class 0
> > [ 112.397315][ T256] blk_update_request: I/O error, dev nvme0n1, sector
> 609800 op 0x1:(WRITE) flags 0x104000 phys_seg 127 prio class 0
> > [ 112.459313][ T7] nvme nvme0: failed to mark controller live state
> > [ 112.465758][ T7] nvme nvme0: Removing after probe failure status: -19
> > dd: error writing '/dev/nvme0n1': No space left on device
> > 112200+0 records in
> > 112199+0 records out
> > 459567104 bytes (438.3MB) copied, 70.573266 seconds, 6.2MB/s
> > # [ 112.935768][ T7] nvme nvme0: failed to set APST feature (-19)
> >
> >
> >
> >
> > Here is the dts I used:
> > pciec: pcie-controller@2040000000 {
> > compatible = "cdns,cdns-pcie-host";
> > device_type = "pci";
> > #address-cells = <3>;
> > #size-cells = <2>;
> > bus-range = <0 5>;
> > linux,pci-domain = <0>;
> > cdns,no-bar-match-nbits = <38>;
> > vendor-id = <0x17cd>;
> > device-id = <0x0100>;
> > reg-names = "reg", "cfg";
> > reg = <0x20 0x40000000 0x0 0x10000000>,
> > <0x20 0x00000000 0x0 0x00001000>; /* RC only */
> >
> > /*
> > * type: 0x00000000 cfg space
> > * type: 0x01000000 IO
> > * type: 0x02000000 32bit mem space No prefetch
> > * type: 0x03000000 64bit mem space No prefetch
> > * type: 0x43000000 64bit mem space prefetch
> > * The First 16MB from BUS_DEV_FUNC=0:0:0 for cfg space
> > * <0x00000000 0x00 0x00000000 0x20 0x00000000 0x00
> 0x01000000>, CFG_SPACE
> > */
> > ranges = <0x01000000 0x00 0x00000000 0x20 0x00100000 0x00
> 0x00100000>,
> > <0x02000000 0x00 0x08000000 0x20 0x08000000 0x00
> 0x08000000>;
> >
> > #interrupt-cells = <0x1>;
> > interrupt-map-mask = <0x00 0x0 0x0 0x7>;
> > interrupt-map = <0x0 0x0 0x0 0x1 &gic 0 229 0x4>,
> > <0x0 0x0 0x0 0x2 &gic 0 230 0x4>,
> > <0x0 0x0 0x0 0x3 &gic 0 231 0x4>,
> > <0x0 0x0 0x0 0x4 &gic 0 232 0x4>;
> > phys = <&pcie_phy>;
> > phy-names="pcie-phy";
> > status = "ok";
> > };
> >
> >
> > After some digging, I find if I change the controller's range
> > property from
> >
> > <0x02000000 0x00 0x08000000 0x20 0x08000000 0x00 0x08000000> into
> > <0x02000000 0x00 0x00400000 0x20 0x00400000 0x00 0x08000000>,
> >
> > then dd will success without timeout. IIUC, range here
> > is only for non-prefetch 32bit mmio, but dd will use dma (maybe cpu
> > will send cmd to nvme controller via mmio?).
>
> I don't know how to interpret "ranges". Can you supply the dmesg and
> "lspci -vvs 0000:05:00.0" output both ways, e.g.,
>
> pci_bus 0000:00: root bus resource [mem 0x7f800000-0xefffffff window]
> pci_bus 0000:00: root bus resource [mem 0xfd000000-0xfe7fffff window]
> pci 0000:05:00.0: [vvvv:dddd] type 00 class 0x...
> pci 0000:05:00.0: reg 0x10: [mem 0x.....000-0x.....fff ...]
>

From the source code:

static int devm_of_pci_get_host_bridge_resources
{
...
err = of_pci_range_parser_init(&parser, dev_node);
...
for_each_of_pci_range(&parser, &range) {
/* Read next ranges element */
if ((range.flags & IORESOURCE_TYPE_BITS) == IORESOURCE_IO)
range_type = "IO";
else if ((range.flags & IORESOURCE_TYPE_BITS) == IORESOURCE_MEM)
range_type = "MEM";
else
range_type = "err";
dev_info(dev, " %6s %#012llx..%#012llx -> %#012llx\n",
range_type, range.cpu_addr,
range.cpu_addr + range.size - 1, range.pci_addr);

/*
* If we failed translation or got a zero-sized region
* then skip this range
*/
if (range.cpu_addr == OF_BAD_ADDR || range.size == 0)
continue;

err = of_pci_range_to_resource(&range, dev_node, &tmp_res);
...
}

So, it simply translates mmio mapping.



Take
ranges = <0x01000000 0x00 0x00000000 0x20 0x00100000 0x00 0x00100000>,
<0x02000000 0x00 0x08000000 0x20 0x08000000 0x00 0x04000000>;
for example (this ranges will make nvme timeout):

lspci output:
# lspci -i /usr/share/misc/pci.ids -vvs 0000:05:00.0
05:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller 980 (prog-if 02 [NVM Express])
Subsystem: Samsung Electronics Co Ltd Device a801
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 44
Region 0: Memory at 2008000000 (64-bit, non-prefetchable) [size=16K]
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [50] MSI: Enable- Count=1/32 Maskable- 64bit+
Address: 0000000000000000 Data: 0000
Capabilities: [70] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 25.000W
DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM L1, Exit Latency L1 <64us
ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 8GT/s (ok), Width x4 (ok)
TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range ABCD, TimeoutDis+ NROPrPrP- LTR+
10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
FRS- TPHComp- ExtTPHComp-
AtomicOpsCap: 32bit- 64bit- 128bitCAS-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- OBFF Disabled,
AtomicOpsCtl: ReqEn-
LnkCap2: Supported Link Speeds: 2.5-8GT/s, Crosslink- Retimer- 2Retimers- DRS-
LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+ EqualizationPhase1+
EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
Retimer- 2Retimers- CrosslinkRes: unsupported
Capabilities: [b0] MSI-X: Enable- Count=13 Masked-
Vector table: BAR=0 offset=00003000
PBA: BAR=0 offset=00002000
Capabilities: [100 v2] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
MultHdrRecCap+ MultHdrRecEn- TLPPfxPres- HdrLogCap-
HeaderLog: 00000000 00000000 00000000 00000000
Capabilities: [148 v1] Device Serial Number 00-00-00-00-00-00-00-00
Capabilities: [158 v1] Power Budgeting <?>
Capabilities: [168 v1] Secondary PCI Express
LnkCtl3: LnkEquIntrruptEn- PerformEqu-
LaneErrStat: 0
Capabilities: [188 v1] Latency Tolerance Reporting
Max snoop latency: 0ns
Max no snoop latency: 0ns
Capabilities: [190 v1] L1 PM Substates
L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
PortCommonModeRestoreTime=10us PortTPowerOnTime=10us
L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
T_CommonMode=0us LTR1.2_Threshold=0ns
L1SubCtl2: T_PwrOn=10us
Kernel driver in use: nvme
Kernel modules: nvme




/proc/iomem:

01200000-ffffffff : System RAM
01280000-022affff : Kernel code
022b0000-0295ffff : reserved
02960000-040cffff : Kernel data
05280000-0528ffff : reserved
41cc0000-422c0fff : reserved
422c1000-4232afff : reserved
4232d000-667bbfff : reserved
667bc000-667bcfff : reserved
667bd000-667c0fff : reserved
667c1000-ffffffff : reserved
2000000000-2000000fff : cfg
2008000000-200bffffff : pcie-controller@2040000000
2008000000-20081fffff : PCI Bus 0000:01
2008000000-20080fffff : PCI Bus 0000:02
2008000000-20080fffff : PCI Bus 0000:05
2008000000-2008003fff : 0000:05:00.0
2008000000-2008003fff : nvme
2008100000-200817ffff : 0000:01:00.0
...

From dmesg output:

# dmesg | grep pci
[ 0.000000] Kernel command line: console=ttyS0 ubi.mtd=lnx root=ubi0:rootfs rw rootfstype=ubifs init=/linuxrc pci=nomsi pcie_pme=nomsi nvme_core.default_ps_max_latency_us=0 pcie_aspm=off pci=nommconf nvme_core.default_ps_max_latency_us=0
[ 2.431732] cdns-pcie 2040000000.pcie-controller: host bridge /pcie-controller@2040000000 ranges:
[ 2.442055] cdns-pcie 2040000000.pcie-controller: Parsing ranges property...
[ 2.450010] cdns-pcie 2040000000.pcie-controller: IO 0x2000100000..0x20001fffff -> 0x0000000000
[ 2.460215] cdns-pcie 2040000000.pcie-controller: MEM 0x2008000000..0x200bffffff -> 0x0008000000
[ 2.523271] cdns-pcie 2040000000.pcie-controller: Link up
[ 2.530647] cdns-pcie 2040000000.pcie-controller: PCI host bridge to bus 0000:00
[ 2.538885] pci_bus 0000:00: root bus resource [bus 00-05]
[ 2.545194] pci_bus 0000:00: root bus resource [io 0x0000-0xfffff]
[ 2.552280] pci_bus 0000:00: root bus resource [mem 0x2008000000-0x200bffffff] (bus address [0x08000000-0x0bffffff])
[ 2.563651] pci_bus 0000:00: scanning bus
[ 2.568753] pci 0000:00:00.0: [17cd:0100] type 01 class 0x060400
[ 2.576196] pci 0000:00:00.0: supports D1
[ 2.581006] pci 0000:00:00.0: PME# supported from D0 D1 D3hot
[ 2.587667] pci 0000:00:00.0: PME# disabled
[ 2.597799] pci_bus 0000:00: fixups for bus
[ 2.602800] pci 0000:00:00.0: scanning [bus 00-00] behind bridge, pass 0
[ 2.610308] pci 0000:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[ 2.619160] pci 0000:00:00.0: scanning [bus 00-00] behind bridge, pass 1
[ 2.627525] pci_bus 0000:01: scanning bus
[ 2.632550] pci 0000:01:00.0: [12d8:c016] type 01 class 0x060400
[ 2.639425] pci 0000:01:00.0: reg 0x10: [mem 0x00000000-0x0007ffff]
[ 2.647250] pci 0000:01:00.0: PME# supported from D0 D3hot D3cold
[ 2.654165] pci 0000:01:00.0: PME# disabled
[ 2.664011] pci_bus 0000:01: fixups for bus
[ 2.669012] pci 0000:01:00.0: scanning [bus 00-00] behind bridge, pass 0
[ 2.676521] pci 0000:01:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[ 2.685370] pci 0000:01:00.0: scanning [bus 00-00] behind bridge, pass 1
[ 2.693768] pci_bus 0000:02: scanning bus
[ 2.699101] pci 0000:02:02.0: [12d8:c016] type 01 class 0x060400
[ 2.706740] pci 0000:02:02.0: PME# supported from D0 D3hot D3cold
[ 2.713653] pci 0000:02:02.0: PME# disabled
[ 2.720417] pci 0000:02:04.0: [12d8:c016] type 01 class 0x060400
[ 2.728032] pci 0000:02:04.0: PME# supported from D0 D3hot D3cold
[ 2.734945] pci 0000:02:04.0: PME# disabled
[ 2.741684] pci 0000:02:06.0: [12d8:c016] type 01 class 0x060400
[ 2.749288] pci 0000:02:06.0: PME# supported from D0 D3hot D3cold
[ 2.756201] pci 0000:02:06.0: PME# disabled
[ 2.765483] pci_bus 0000:02: fixups for bus
[ 2.770482] pci 0000:02:02.0: scanning [bus 00-00] behind bridge, pass 0
[ 2.777991] pci 0000:02:02.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[ 2.786839] pci 0000:02:04.0: scanning [bus 00-00] behind bridge, pass 0
[ 2.794337] pci 0000:02:04.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[ 2.803184] pci 0000:02:06.0: scanning [bus 00-00] behind bridge, pass 0
[ 2.810691] pci 0000:02:06.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[ 2.819539] pci 0000:02:02.0: scanning [bus 00-00] behind bridge, pass 1
[ 2.827948] pci_bus 0000:03: scanning bus
[ 2.836100] pci_bus 0000:03: fixups for bus
[ 2.841085] pci_bus 0000:03: bus scan returning with max=03
[ 2.847467] pci_bus 0000:03: busn_res: [bus 03-05] end is updated to 03
[ 2.854911] pci 0000:02:04.0: scanning [bus 00-00] behind bridge, pass 1
[ 2.863151] pci_bus 0000:04: scanning bus
[ 2.871319] pci_bus 0000:04: fixups for bus
[ 2.876303] pci_bus 0000:04: bus scan returning with max=04
[ 2.882683] pci_bus 0000:04: busn_res: [bus 04-05] end is updated to 04
[ 2.890151] pci 0000:02:06.0: scanning [bus 00-00] behind bridge, pass 1
[ 2.898383] pci_bus 0000:05: scanning bus
[ 2.903362] pci 0000:05:00.0: [144d:a809] type 00 class 0x010802
[ 2.910266] pci 0000:05:00.0: reg 0x10: [mem 0x00000000-0x00003fff 64bit]
[ 2.923188] pci_bus 0000:05: fixups for bus
[ 2.928174] pci_bus 0000:05: bus scan returning with max=05
[ 2.934553] pci_bus 0000:05: busn_res: [bus 05] end is updated to 05
[ 2.941724] pci_bus 0000:02: bus scan returning with max=05
[ 2.948101] pci_bus 0000:02: busn_res: [bus 02-05] end is updated to 05
[ 2.955557] pci_bus 0000:01: bus scan returning with max=05
[ 2.961983] pci_bus 0000:01: busn_res: [bus 01-05] end is updated to 05
[ 2.969413] pci_bus 0000:00: bus scan returning with max=05
[ 2.975843] pci 0000:00:00.0: BAR 8: assigned [mem 0x2008000000-0x20081fffff]
[ 2.983843] pci 0000:01:00.0: BAR 8: assigned [mem 0x2008000000-0x20080fffff]
[ 2.991795] pci 0000:01:00.0: BAR 0: assigned [mem 0x2008100000-0x200817ffff]
[ 2.999801] pci 0000:02:06.0: BAR 8: assigned [mem 0x2008000000-0x20080fffff]
[ 3.007763] pci 0000:02:02.0: PCI bridge to [bus 03]
[ 3.013622] pci 0000:02:04.0: PCI bridge to [bus 04]
[ 3.019487] pci 0000:05:00.0: BAR 0: assigned [mem 0x2008000000-0x2008003fff 64bit]
[ 3.028012] pci 0000:02:06.0: PCI bridge to [bus 05]
[ 3.033792] pci 0000:02:06.0: bridge window [mem 0x2008000000-0x20080fffff]
[ 3.041767] pci 0000:01:00.0: PCI bridge to [bus 02-05]
[ 3.047808] pci 0000:01:00.0: bridge window [mem 0x2008000000-0x20080fffff]
[ 3.055782] pci 0000:00:00.0: PCI bridge to [bus 01-05]
[ 3.061822] pci 0000:00:00.0: bridge window [mem 0x2008000000-0x20081fffff]
[ 4.184289] pcie_pme=nomsi


Also take
ranges = <0x01000000 0x00 0x00000000 0x20 0x00100000 0x00 0x00100000>,
<0x02000000 0x00 0x08000000 0x20 0x08000000 0x00 0x04000000>;
for example (this ranges will not make nvme timeout):

lspci:
# lspci -i /usr/share/misc/pci.ids -vvs 0000:05:00.0
05:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller 980 (prog-if 02 [NVM Express])
Subsystem: Samsung Electronics Co Ltd Device a801
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 44
Region 0: Memory at 2000400000 (64-bit, non-prefetchable) [size=16K]
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [50] MSI: Enable- Count=1/32 Maskable- 64bit+
Address: 0000000000000000 Data: 0000
Capabilities: [70] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 25.000W
DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM L1, Exit Latency L1 <64us
ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 8GT/s (ok), Width x4 (ok)
TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range ABCD, TimeoutDis+ NROPrPrP- LTR+
10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
FRS- TPHComp- ExtTPHComp-
AtomicOpsCap: 32bit- 64bit- 128bitCAS-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- OBFF Disabled,
AtomicOpsCtl: ReqEn-
LnkCap2: Supported Link Speeds: 2.5-8GT/s, Crosslink- Retimer- 2Retimers- DRS-
LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+ EqualizationPhase1+
EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
Retimer- 2Retimers- CrosslinkRes: unsupported
Capabilities: [b0] MSI-X: Enable- Count=13 Masked-
Vector table: BAR=0 offset=00003000
PBA: BAR=0 offset=00002000
Capabilities: [100 v2] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
MultHdrRecCap+ MultHdrRecEn- TLPPfxPres- HdrLogCap-
HeaderLog: 00000000 00000000 00000000 00000000
Capabilities: [148 v1] Device Serial Number 00-00-00-00-00-00-00-00
Capabilities: [158 v1] Power Budgeting <?>
Capabilities: [168 v1] Secondary PCI Express
LnkCtl3: LnkEquIntrruptEn- PerformEqu-
LaneErrStat: 0
Capabilities: [188 v1] Latency Tolerance Reporting
Max snoop latency: 0ns
Max no snoop latency: 0ns
Capabilities: [190 v1] L1 PM Substates
L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
PortCommonModeRestoreTime=10us PortTPowerOnTime=10us
L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
T_CommonMode=0us LTR1.2_Threshold=0ns
L1SubCtl2: T_PwrOn=10us
Kernel driver in use: nvme
Kernel modules: nvme

/proc/iomem:
01200000-ffffffff : System RAM
01280000-022affff : Kernel code
022b0000-0295ffff : reserved
02960000-040cffff : Kernel data
05280000-0528ffff : reserved
41cc0000-422c0fff : reserved
422c1000-4232afff : reserved
4232d000-667bbfff : reserved
667bc000-667bcfff : reserved
667bd000-667c0fff : reserved
667c1000-ffffffff : reserved
2000000000-2000000fff : cfg
2000400000-20083fffff : pcie-controller@2040000000
2000400000-20005fffff : PCI Bus 0000:01
2000400000-20004fffff : PCI Bus 0000:02
2000400000-20004fffff : PCI Bus 0000:05
2000400000-2000403fff : 0000:05:00.0
2000500000-200057ffff : 0000:01:00.0
...

Dmesg:
# dmesg | grep pci
[ 0.000000] Kernel command line: console=ttyS0 ubi.mtd=lnx root=ubi0:rootfs rw rootfstype=ubifs init=/linuxrc pci=nomsi pcie_pme=nomsi nvme_core.default_ps_max_latency_us=0 pcie_aspm=off pci=nommconf nvme_core.default_ps_max_latency_us=0
[ 2.436357] cdns-pcie 2040000000.pcie-controller: host bridge /pcie-controller@2040000000 ranges:
[ 2.446129] cdns-pcie 2040000000.pcie-controller: Parsing ranges property...
[ 2.454141] cdns-pcie 2040000000.pcie-controller: IO 0x2000100000..0x20001fffff -> 0x0000000000
[ 2.464350] cdns-pcie 2040000000.pcie-controller: MEM 0x2000400000..0x20083fffff -> 0x0000400000
[ 2.527395] cdns-pcie 2040000000.pcie-controller: Link up
[ 2.534745] cdns-pcie 2040000000.pcie-controller: PCI host bridge to bus 0000:00
[ 2.543036] pci_bus 0000:00: root bus resource [bus 00-05]
[ 2.549345] pci_bus 0000:00: root bus resource [io 0x0000-0xfffff]
[ 2.556431] pci_bus 0000:00: root bus resource [mem 0x2000400000-0x20083fffff] (bus address [0x00400000-0x083fffff])
[ 2.567802] pci_bus 0000:00: scanning bus
[ 2.572823] pci 0000:00:00.0: [17cd:0100] type 01 class 0x060400
[ 2.580290] pci 0000:00:00.0: supports D1
[ 2.585099] pci 0000:00:00.0: PME# supported from D0 D1 D3hot
[ 2.591709] pci 0000:00:00.0: PME# disabled
[ 2.601802] pci_bus 0000:00: fixups for bus
[ 2.606801] pci 0000:00:00.0: scanning [bus 00-00] behind bridge, pass 0
[ 2.614310] pci 0000:00:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[ 2.623161] pci 0000:00:00.0: scanning [bus 00-00] behind bridge, pass 1
[ 2.631551] pci_bus 0000:01: scanning bus
[ 2.636563] pci 0000:01:00.0: [12d8:c016] type 01 class 0x060400
[ 2.643460] pci 0000:01:00.0: reg 0x10: [mem 0x00000000-0x0007ffff]
[ 2.651289] pci 0000:01:00.0: PME# supported from D0 D3hot D3cold
[ 2.658202] pci 0000:01:00.0: PME# disabled
[ 2.667966] pci_bus 0000:01: fixups for bus
[ 2.673047] pci 0000:01:00.0: scanning [bus 00-00] behind bridge, pass 0
[ 2.680556] pci 0000:01:00.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[ 2.689405] pci 0000:01:00.0: scanning [bus 00-00] behind bridge, pass 1
[ 2.697804] pci_bus 0000:02: scanning bus
[ 2.703079] pci 0000:02:02.0: [12d8:c016] type 01 class 0x060400
[ 2.710742] pci 0000:02:02.0: PME# supported from D0 D3hot D3cold
[ 2.717657] pci 0000:02:02.0: PME# disabled
[ 2.724466] pci 0000:02:04.0: [12d8:c016] type 01 class 0x060400
[ 2.732078] pci 0000:02:04.0: PME# supported from D0 D3hot D3cold
[ 2.739037] pci 0000:02:04.0: PME# disabled
[ 2.745734] pci 0000:02:06.0: [12d8:c016] type 01 class 0x060400
[ 2.753344] pci 0000:02:06.0: PME# supported from D0 D3hot D3cold
[ 2.760257] pci 0000:02:06.0: PME# disabled
[ 2.769510] pci_bus 0000:02: fixups for bus
[ 2.774532] pci 0000:02:02.0: scanning [bus 00-00] behind bridge, pass 0
[ 2.782043] pci 0000:02:02.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[ 2.790892] pci 0000:02:04.0: scanning [bus 00-00] behind bridge, pass 0
[ 2.798494] pci 0000:02:04.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[ 2.807347] pci 0000:02:06.0: scanning [bus 00-00] behind bridge, pass 0
[ 2.814855] pci 0000:02:06.0: bridge configuration invalid ([bus 00-00]), reconfiguring
[ 2.823701] pci 0000:02:02.0: scanning [bus 00-00] behind bridge, pass 1
[ 2.832098] pci_bus 0000:03: scanning bus
[ 2.840343] pci_bus 0000:03: fixups for bus
[ 2.845328] pci_bus 0000:03: bus scan returning with max=03
[ 2.851708] pci_bus 0000:03: busn_res: [bus 03-05] end is updated to 03
[ 2.859152] pci 0000:02:04.0: scanning [bus 00-00] behind bridge, pass 1
[ 2.867404] pci_bus 0000:04: scanning bus
[ 2.875551] pci_bus 0000:04: fixups for bus
[ 2.880535] pci_bus 0000:04: bus scan returning with max=04
[ 2.886914] pci_bus 0000:04: busn_res: [bus 04-05] end is updated to 04
[ 2.894359] pci 0000:02:06.0: scanning [bus 00-00] behind bridge, pass 1
[ 2.902637] pci_bus 0000:05: scanning bus
[ 2.907632] pci 0000:05:00.0: [144d:a809] type 00 class 0x010802
[ 2.914536] pci 0000:05:00.0: reg 0x10: [mem 0x00000000-0x00003fff 64bit]
[ 2.927440] pci_bus 0000:05: fixups for bus
[ 2.932426] pci_bus 0000:05: bus scan returning with max=05
[ 2.938807] pci_bus 0000:05: busn_res: [bus 05] end is updated to 05
[ 2.946025] pci_bus 0000:02: bus scan returning with max=05
[ 2.952403] pci_bus 0000:02: busn_res: [bus 02-05] end is updated to 05
[ 2.959834] pci_bus 0000:01: bus scan returning with max=05
[ 2.966233] pci_bus 0000:01: busn_res: [bus 01-05] end is updated to 05
[ 2.973666] pci_bus 0000:00: bus scan returning with max=05
[ 2.980097] pci 0000:00:00.0: BAR 8: assigned [mem 0x2000400000-0x20005fffff]
[ 2.988097] pci 0000:01:00.0: BAR 8: assigned [mem 0x2000400000-0x20004fffff]
[ 2.996046] pci 0000:01:00.0: BAR 0: assigned [mem 0x2000500000-0x200057ffff]
[ 3.004092] pci 0000:02:06.0: BAR 8: assigned [mem 0x2000400000-0x20004fffff]
[ 3.012055] pci 0000:02:02.0: PCI bridge to [bus 03]
[ 3.017867] pci 0000:02:04.0: PCI bridge to [bus 04]
[ 3.023709] pci 0000:05:00.0: BAR 0: assigned [mem 0x2000400000-0x2000403fff 64bit]
[ 3.032253] pci 0000:02:06.0: PCI bridge to [bus 05]
[ 3.038071] pci 0000:02:06.0: bridge window [mem 0x2000400000-0x20004fffff]
[ 3.046086] pci 0000:01:00.0: PCI bridge to [bus 02-05]
[ 3.052127] pci 0000:01:00.0: bridge window [mem 0x2000400000-0x20004fffff]
[ 3.060100] pci 0000:00:00.0: PCI bridge to [bus 01-05]
[ 3.066140] pci 0000:00:00.0: bridge window [mem 0x2000400000-0x20005fffff]
[ 4.181033] pcie_pme=nomsi
[ 44.419264] nvme nvme0: pci function 0000:05:00.0
[ 44.425541] pci 0000:00:00.0: enabling device (0000 -> 0002)
[ 44.432217] pci 0000:00:00.0: enabling bus mastering
[ 44.438183] pci 0000:01:00.0: enabling device (0000 -> 0002)
[ 44.444720] pci 0000:01:00.0: enabling bus mastering
[ 44.453373] pci 0000:02:06.0: enabling device (0000 -> 0002)
[ 44.459951] pci 0000:02:06.0: enabling bus mastering


[firstlove@sh-lchen tmp]$ diff no-timeout.lspci timeout.lspci
8c8
< Region 0: Memory at 2000400000 (64-bit, non-prefetchable) [size=16K]
---
> Region 0: Memory at 2008000000 (64-bit, non-prefetchable) [size=16K]

> > Question:
> > 1. Why dd can cause nvme timeout? Is there more debug ways?
> > 2. How can this mmio range affect nvme timeout?
> >
> > Regards,
> > Li
> >
> >
> **************************************************************
> ********
> > This email and attachments contain Ambarella Proprietary and/or Confidential
> Information and is intended solely for the use of the individual(s) to whom it is
> addressed. Any unauthorized review, use, disclosure, distribute, copy, or print is
> prohibited. If you are not an intended recipient, please contact the sender by
> reply email and destroy all copies of the original message. Thank you.
>
> ##############################################################
> ########
> This EXTERNAL email has been scanned by Proofpoint Email Protect service.

Regards,
Li