Re: ath9k_htc - Division by zero in kernel (as well as firmware panic)

From: Tobias Diedrich
Date: Tue Jun 06 2017 - 20:19:22 EST


Oleksij Rempel wrote:
> Yes, this is a "normal" problem. The firmware has no error handler for PCI
> bus related exceptions. So if we fail to read the PCI bus the first time, we
> have the choice to Oops and stall or Oops and reboot ASAP. So we reboot
> and provide a kernel "firmware panic!" message.
> Everyone who can or wants to fix this is welcome.
>
> > *****
> > Jun 02 14:55:30 computer kernel: usb 1-1.1: ath: firmware panic!
> > exccause: 0x0000000d; pc: 0x0090ae81; badvaddr: 0x10ff4038.
[...]

>memdmp 50ae78 50ae88

50ae78: 6c10 0412 6aa2 0c02 0088 20c0 2008 1940 l...j..........@

[...copy to bin...]
$ bin/objdump -b binary -m xtensa -D /tmp/memdump.bin
[..]
   0:   6c1004      entry   a1, 32
   3:   126aa2      l32r    a2, 0xfffdaa8c
   6:   0c0200      memw
   9:   8820        l32i.n  a8, a2, 0   <---------- Exception cause PC still points at load
   b:   c020        movi.n  a2, 0
   d:   081940      extui   a9, a8, 1, 1
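
For reference, the elided "copy to bin" step can be scripted. This is only a
sketch of how I would do it, assuming the memdmp row layout shown above
(address, colon, up to eight 16-bit hex words, ASCII column). The function
name is made up for illustration; the output path is the one passed to
objdump above.

```python
import re

# One dump row: "<hex addr>: " followed by up to eight 4-digit hex words.
# The trailing ASCII column is ignored (assumes full rows of eight words).
ROW = re.compile(r"\s*[0-9a-fA-F]+:((?:\s[0-9a-fA-F]{4}){1,8})")

def memdmp_to_bytes(text):
    """Turn memdmp console output into raw bytes, in the order printed."""
    out = bytearray()
    for line in text.splitlines():
        m = ROW.match(line)
        if not m:
            continue  # prompt lines, blank lines, anything that is not a row
        for word in m.group(1).split():
            out += bytes.fromhex(word)
    return bytes(out)

dump = """>memdmp 50ae78 50ae88

50ae78: 6c10 0412 6aa2 0c02 0088 20c0 2008 1940 l...j..........@
"""
data = memdmp_to_bytes(dump)

if __name__ == "__main__":
    with open("/tmp/memdump.bin", "wb") as f:
        f.write(data)
```

No byte swapping is needed here: the words are written out exactly as
printed, which is the byte order objdump decoded above (6c1004, 126aa2, ...).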

Judging from that it should be fairly simple to at least implement
some sort of retry, possibly after triggering a PCIe link retrain?
There are some related PCIe root complex registers that may point to
what exactly failed if they were dumped.

The root complex registers live at 0x00040000 and I think match the
registers described for the root complex in the AR9344 datasheet.

PCIE_INT_MASK would map to 0x40050 and has a bit for SYS_ERR:
"A system error. The RC Core asserts CFG_SYS_ERR_RC if any device in
the hierarchy reports any of the following errors and the associated
enable bit is set in the Root Control register: ERR_COR, ERR_FATAL,
ERR_NONFATAL."

AFAICS link retrain can be done by setting bit3 (INIT_RST,
"Application request to initiate a training reset") in
PCIE_APP (0x40000).
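
To make the retry idea concrete, here is a rough model of the control flow
in Python. It is not firmware code (that would be C in the exception
handler) and it has not been run against hardware. The register address and
bit number are the ones given above; FakeTarget, BusError and
read_with_retry are invented for this sketch, and whether a retrain actually
fixes the failed read is precisely the untested assumption.

```python
PCIE_APP = 0x40000   # RC application register (address as given above)
INIT_RST = 1 << 3    # bit3: application request to initiate a training reset

class BusError(Exception):
    """Stands in for the load exception the firmware currently panics on."""

class FakeTarget:
    """Hypothetical model: reads across PCIe fail until the link is retrained."""
    def __init__(self):
        self.regs = {PCIE_APP: 0}
        self.link_ok = False

    def write32(self, addr, value):
        self.regs[addr] = value
        if addr == PCIE_APP and value & INIT_RST:
            self.link_ok = True  # assumption: retraining repairs the link

    def read32(self, addr):
        if addr in self.regs:
            return self.regs[addr]      # local RC register, always readable
        if not self.link_ok:
            raise BusError(hex(addr))   # read over the broken link faults
        return 0xDEADBEEF               # placeholder for real device data

def read_with_retry(target, addr, retries=3):
    """Retry a faulting read, requesting a link retrain between attempts."""
    for attempt in range(retries + 1):
        try:
            return target.read32(addr)
        except BusError:
            if attempt == retries:
                raise  # out of retries: fall back to today's panic path
            app = target.read32(PCIE_APP)
            target.write32(PCIE_APP, app | INIT_RST)

target = FakeTarget()
value = read_with_retry(target, 0x10FF4038)  # badvaddr from the panic message
```

In real firmware one would presumably also have to wait for the link to come
back up before retrying, and I do not know whether INIT_RST clears itself.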

See sboot/magpie_1_1/sboot/cmnos/eeprom/src/cmnos_eeprom.c (which
flips some bits in the RC to enable the PCIe bus for reading the
EEPROM).

The root complex PCI configuration space is at 0x20000 and could
hold further error details:
>memdmp 20000 20200

020000: a02a 168c 0010 0006 0000 0001 0001 0000 .*..............
020010: 0000 0000 0000 0000 0000 0000 0000 0000 ................
020020: 0000 0000 0000 0000 0000 0000 0000 0000 ................
020030: 0000 0000 0000 0040 0000 0000 0000 01ff .......@........
020040: 5bc3 5001 0000 0000 0000 0000 0000 0000 [.P.............
020050: 0080 7005 0000 0000 0000 0000 0000 0000 ..p.............
020060: 0000 0000 0000 0000 0000 0000 0000 0000 ................
020070: 0042 0010 0000 8701 0000 2010 0013 4411 .B............D.
020080: 3011 0000 0000 0000 00c0 03c0 0000 0000 0...............
020090: 0000 0000 0000 0010 0000 0000 0000 0000 ................
0200a0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
0200b0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
0200c0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
0200d0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
0200e0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
0200f0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
020100: 1401 0001 0000 0000 0000 0000 0006 2030 ...............0
020110: 0000 0000 0000 2000 0000 00a0 0000 0000 ................
020120: 0000 0000 0000 0000 0000 0000 0000 0000 ................
020130: 0000 0000 0000 0000 0000 0000 0000 0000 ................
020140: 0001 0002 0000 0000 0000 0000 0000 0000 ................
020150: 0000 0000 8000 00ff 0000 0000 0000 0000 ................
020160: 0000 0000 0000 0000 0000 0000 0000 0000 ................
020170: 0000 0000 0000 0000 0000 0000 0000 0000 ................
020180: 0000 0000 0000 0000 0000 0000 0000 0000 ................
020190: 0000 0000 0000 0000 0000 0000 0000 0000 ................
0201a0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
0201b0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
0201c0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
0201d0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
0201e0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
0201f0: 0000 0000 0000 0000 0000 0000 0000 0000 ................

Transformed into something suitable for feeding into lspci -F:

00:00.0 Description filled in by lspci
00: 8c 16 2a a0 06 00 10 00 01 00 00 00 00 00 01 00
10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
30: 00 00 00 00 40 00 00 00 00 00 00 00 ff 01 00 00
40: 01 50 c3 5b 00 00 00 00 00 00 00 00 00 00 00 00
50: 05 70 80 00 00 00 00 00 00 00 00 00 00 00 00 00
60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
70: 10 00 42 00 01 87 00 00 10 20 00 00 11 44 13 00
80: 00 00 11 30 00 00 00 00 c0 03 c0 00 00 00 00 00
90: 00 00 00 00 10 00 00 00 00 00 00 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
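
The transformation is mechanical: memdmp prints 16-bit words in the CPU's
byte order, while lspci wants individual bytes in PCI (little-endian) order,
so each 32-bit word gets byte-reversed. A sketch of a script that does this,
assuming full rows of eight words and a contiguous dump starting at config
offset 0; the function names are made up, and only the first 0x100 bytes are
emitted, as in the listing above.

```python
import re

# One dump row: "<hex addr>: " followed by up to eight 4-digit hex words.
ROW = re.compile(r"\s*[0-9a-fA-F]+:((?:\s[0-9a-fA-F]{4}){1,8})")

def rows_to_bytes(text):
    """Collect the hex words of all memdmp rows, in the order printed."""
    out = bytearray()
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            for word in m.group(1).split():
                out += bytes.fromhex(word)
    return bytes(out)

def to_lspci_hexdump(raw, slot="00:00.0", limit=0x100):
    """Format raw config space bytes for 'lspci -F'."""
    raw = raw[:limit]
    swapped = bytearray()
    for i in range(0, len(raw), 4):
        swapped += raw[i:i + 4][::-1]  # reverse the bytes of each 32-bit word
    lines = [f"{slot} Description filled in by lspci"]
    for off in range(0, len(swapped), 16):
        chunk = swapped[off:off + 16]
        lines.append(f"{off:02x}: " + " ".join(f"{b:02x}" for b in chunk))
    return "\n".join(lines)

sample = """020000: a02a 168c 0010 0006 0000 0001 0001 0000 .*..............
020010: 0000 0000 0000 0000 0000 0000 0000 0000 ................
"""
hexdump = to_lspci_hexdump(rows_to_bytes(sample))

if __name__ == "__main__":
    print(hexdump)
```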

$ lspci -F /tmp/hexdump -vvv
00:00.0 Non-VGA unclassified device: Qualcomm Atheros Device a02a (rev 01)
    !!! Invalid class 0000 for header type 01
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0
    Interrupt: pin A routed to IRQ 255
    Bus: primary=00, secondary=00, subordinate=00, sec-latency=0
    I/O behind bridge: 00000000-00000fff
    Memory behind bridge: 00000000-000fffff
    Prefetchable memory behind bridge: 00000000-000fffff
    Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
    BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
        PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
    Capabilities: [40] Power Management version 3
        Flags: PMEClk- DSI- D1+ D2- AuxCurrent=375mA PME(D0+,D1+,D2-,D3hot+,D3cold-)
        Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
        Address: 0000000000000000 Data: 0000
    Capabilities: [70] Express (v2) Root Port (Slot-), MSI 00
        DevCap: MaxPayload 256 bytes, PhantFunc 0
            ExtTag- RBE+
        DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
            RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
            MaxPayload 128 bytes, MaxReadReq 512 bytes
        DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
        LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s, Exit Latency L0s <1us, L1 <64us
            ClockPM- Surprise- LLActRep+ BwNot- ASPMOptComp-
        LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt-
        RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
        RootCap: CRSVisible-
        RootSta: PME ReqID 0000, PMEStatus- PMEPending-
        DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR-, OBFF Not Supported ARIFwd-
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled ARIFwd-
        LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
            Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
            Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
            EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-


> > Jun 02 14:55:30 computer kernel: usb 1-1.1: ath9k_htc: Transferred FW:
> > ath9k_htc/htc_7010-1.4.0.fw, size: 72812

$ ls -l /lib/firmware/ath9k_htc/htc_7010-1.4.0.fw
-rw-r--r-- 1 root root 72812 Dec 14 04:59 /lib/firmware/ath9k_htc/htc_7010-1.4.0.fw
$ sha1sum /lib/firmware/ath9k_htc/htc_7010-1.4.0.fw
959cb6550930de2882e12b9a549c3cf0c9bf51ac /lib/firmware/ath9k_htc/htc_7010-1.4.0.fw



--
Tobias PGP: http://8ef7ddba.uguu.de
