[PATCH] Cover letter for (PCI/AER: only insert one element into kfifo)

From: Yanjiang Jin
Date: Wed Dec 12 2018 - 03:33:22 EST


Without this patch, on a system with multiple PCIe devices, an AER error
on one of them makes aer_recover_work_func() -> kfifo_get() walk a kfifo
that holds the wrong number of elements (16) instead of the one that was queued.
If the uninitialized memory of one of those bogus elements happens to match
another PCIe device (0000:01:00.0), we may get the call trace below.
This is unusual, but it did happen on my board, a QDF2400.
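
To illustrate the failure mode described above, here is a minimal userspace
sketch. It is not the kernel code: struct entry, ring_in(), ring_get() and
queue_and_drain() are hypothetical stand-ins for the AER record and the kfifo
helpers, and the "stale" slots are simulated with a fill pattern. It assumes
the producer passed a byte size where an element count was expected, which is
one way a single queued record can turn into 16 elements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the queued AER record; not the kernel's struct. */
struct entry {
	uint16_t domain;
	uint8_t bus;
	uint8_t devfn;
	int32_t severity;
	void *regs;
};

#define RING_SIZE 32	/* capacity in elements */

struct ring {
	struct entry slots[RING_SIZE];
	unsigned int in;	/* total elements written so far */
	unsigned int out;	/* total elements read so far */
};

/* Copy n ELEMENTS (not bytes) from src; returns how many were stored. */
static unsigned int ring_in(struct ring *r, const struct entry *src,
			    unsigned int n)
{
	unsigned int free_slots = RING_SIZE - (r->in - r->out);
	unsigned int i;

	assert(src != NULL);
	if (n > free_slots)
		n = free_slots;
	for (i = 0; i < n; i++)
		r->slots[(r->in + i) % RING_SIZE] = src[i];
	r->in += n;
	return n;
}

/* Pop one element; returns 1 on success, 0 when the ring is empty. */
static int ring_get(struct ring *r, struct entry *dst)
{
	if (r->in == r->out)
		return 0;
	*dst = r->slots[r->out % RING_SIZE];
	r->out++;
	return 1;
}

/*
 * Queue one real record using the given element count, then drain the
 * ring the way a worker would. Returns how many records the consumer saw.
 */
unsigned int queue_and_drain(unsigned int count)
{
	/* Simulated producer memory: slot 0 is real, the rest is stale. */
	struct entry mem[sizeof(struct entry)];
	static struct ring r;
	struct entry e;
	unsigned int seen = 0;

	memset(&r, 0, sizeof(r));
	memset(mem, 0xa5, sizeof(mem));		/* stand-in for garbage */
	mem[0] = (struct entry){ .domain = 0x0004, .bus = 0x00,
				 .devfn = 0x00, .severity = 0, .regs = NULL };

	ring_in(&r, mem, count);
	while (ring_get(&r, &e))
		seen++;
	return seen;
}
```

In this sketch, queue_and_drain(1) lets the consumer see exactly the one real
record, while queue_and_drain(sizeof(struct entry)) makes it process one real
record followed by garbage ones, each of which would be treated as a device
address to look up.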

# lspci
0000:00:00.0 PCI bridge:
0000:01:00.0 Ethernet controller:
0004:00:00.0 PCI bridge:
0004:01:00.0 Ethernet controller:
0005:00:00.0 PCI bridge:
0005:01:00.0 Ethernet controller:

Call trace:

[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 5
[Hardware Error]: It has been corrected by h/w and requires no further action
[Hardware Error]: event severity: corrected
[Hardware Error]: precise tstamp: 2018-11-29 09:23:16
[Hardware Error]: Error 0, type: corrected
[Hardware Error]: section_type: PCIe error
[Hardware Error]: port_type: 4, root port
[Hardware Error]: version: 3.0
[Hardware Error]: command: 0x0407, status: 0x0010
[Hardware Error]: device_id: 0004:00:00.0
[Hardware Error]: slot: 0
[Hardware Error]: secondary_bus: 0x01
[Hardware Error]: vendor_id: 0x17cb, device_id: 0x0401
[Hardware Error]: class_code: 000406
[Hardware Error]: bridge: secondary_status: 0x0000, control: 0x0000
AER recover: find pci_dev for 0004:00:00:0
pcieport 0004:00:00.0: aer_status: 0x00000001, aer_mask: 0x0000e000
pcieport 0004:00:00.0: [ 0] RxErr (First)
pcieport 0004:00:00.0: aer_layer=Physical Layer, aer_agent=Receiver ID
AER recover: Can not find pci_dev for a38f:00:18:2
AER recover: Can not find pci_dev for 0857:1c:03:5
AER recover: Can not find pci_dev for 62d2:80:19:6
AER recover: Can not find pci_dev for 0857:f8:03:4
AER recover: Can not find pci_dev for 0907:78:07:1
AER recover: Can not find pci_dev for 0000:00:00:1
AER recover: Can not find pci_dev for 0907:00:00:0
AER recover: Can not find pci_dev for 0000:00:00:1
AER recover: find pci_dev for 0000:01:00:0
Unable to handle kernel paging request at virtual address 0000000000813004
Mem abort info:
ESR = 0x96000007
Exception class = DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
Data abort info:
ISV = 0, ISS = 0x00000007
CM = 0, WnR = 0
user pgtable: 64k pages, 48-bit VAs, pgdp = 000000000dce9024
[0000000000813004] pgd=0000001727260003, pud=0000001727260003
pmd=0000001727290003, pte=0000000000000000
Internal error: Oops: 96000007 [#1] SMP
Workqueue: events aer_recover_work_func
pstate: 20400005 (nzCv daif +PAN -UAO)
pc : cper_print_aer+0x4c/0x290
lr : aer_recover_work_func+0x110/0x150
sp : ffff8017ca59fca0
x29: ffff8017ca59fca0 x28: ffff8017ca841000
x27: ffff8017ca841000 x26: 0000000000000001
x25: 0000000000813000 x24: 0000000000000040
x23: 0000000000000040 x22: ffff000008d5f830
x21: ffff0000090f1f10 x20: ffff0000090f1e98
x19: 0000000000000000 x18: ffffffffffffffff
x17: 0000000000000001 x16: 0000000000000007
x15: ffff000009073708 x14: ffff0000891e8faf
x13: ffff0000091e8fbd x12: 2c726579614c206c
x11: ffff00000909b000 x10: 0000000005f5e0ff
x9 : ffff8017ca59fa10 x8 : ffff000009073978
x7 : ffff0000091e8a40 x6 : 0000000000000518
x5 : 0000000000000001 x4 : ffff8017ff9710b8
x3 : ffff8017ff9710b8 x2 : 0000000000813000
x1 : 0000000000000000 x0 : ffff000009073708
Process kworker/11:1 (pid: 232, stack limit = 0x00000000060ad7e1)
Call trace:
cper_print_aer+0x4c/0x290
aer_recover_work_func+0x110/0x150
process_one_work+0x1ac/0x3f0
worker_thread+0x54/0x430
kthread+0x104/0x130
ret_from_fork+0x10/0x18
Code: f9400001 f90057a1 d2800001 54000f40 (2940e334)
SMP: stopping secondary CPUs
Starting crashdump kernel...
Bye!


