Re: [PATCH v2 3/3] PCI: imx6: limit DBI register length

From: Stefan Agner
Date: Wed Nov 28 2018 - 07:45:21 EST


On 28.11.2018 02:28, Andrey Smirnov wrote:
> On Tue, Nov 27, 2018 at 5:12 PM Fabio Estevam <festevam@xxxxxxxxx> wrote:
>>
>> Hi Andrey,
>>
>> On Tue, Nov 27, 2018 at 10:57 PM Andrey Smirnov
>> <andrew.smirnov@xxxxxxxxx> wrote:
>>
>> > Could this be a regression? Prior to 415b6185c541 ("PCI: imx6: Fix
>> > config read timeout handling") all of the imprecise aborts were caught
>> > and handled via no-op handler. I did an experiment on i.MX6Q board
>> > that I have (ZII RDU2) and adding a simple no-op for imprecise aborts
>> > via
>> >
>> > hook_fault_code(16 + 6, imx6q_pcie_no_op_handler, SIGBUS, 0,
>> > "imprecise external abort");

Unsurprisingly, introducing this handler also "fixes" the issue in my
setup.

FWIW, while investigating the Thumb2 issue, I also looked more closely
at the abort handler and its history. I was about to suggest re-adding
this handler too; you just beat me by a few hours :-)

The current 4.9 downstream BSP still has the old fault handler, and
hence this issue does not happen in the downstream BSP.

>> >
>> > seems to "resolve" this problem:
>>
>> Please check https://patchwork.kernel.org/patch/9720313/
>>
>> This commit fixed a kernel crash on mx6q boards with a PCI switch.
>>
>> So we can't go back to the simple no-op.
>
> It's probably not exactly clear from my message, but I wasn't
> proposing to go back to a no-op. What I had in mind is having a no-op
> handler for imprecise aborts _alongside_ the non-linefetch handlers
> that is already there when running against i.MX6Q type of the IP
> block.
>

Agreed, it should be alongside the "external abort on non-linefetch"
handler.

I actually hit another issue yesterday while I had an Intel e1000e
running. Unfortunately I wasn't able to reproduce it, so maybe it was
just a fluke. It would probably also be solved by the additional
"imprecise external abort" handler:

[ 37.644300] fec 2188000.ethernet eth0: Link is Down
[ 38.077383] Unhandled fault: imprecise external abort (0x1406) at 0xb64e8000
[ 38.084638] pgd = ac4709d6
[ 38.087434] [b64e8000] *pgd=00000000
[ 38.091129] Internal error: : 1406 [#1] PREEMPT SMP ARM
[ 38.096508] CPU: 0 PID: 468 Comm: kworker/0:2 Not tainted 4.19.4-00044-ged7a0cc2ef01-dirty #479
[ 38.105428] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[ 38.112143] Workqueue: events e1000_watchdog_task
[ 38.116993] PC is at e1000e_update_stats+0x68/0xa7c
[ 38.122008] LR is at e1000_watchdog_task+0xe8/0x71c
[ 38.127021] pc : [<c0621238>] lr : [<c0628d0c>] psr: 60010013
[ 38.133449] sp : ed185ea0 ip : 00007374 fp : ec83ece4
[ 38.138814] r10: ec71f700 r9 : ec83c500 r8 : ec83c000
[ 38.144180] r7 : ec83c924 r6 : ec83c000 r5 : c1104cc8 r4 : ec83c500
[ 38.150875] r3 : f14c4000 r2 : 000003e8 r1 : 00000000 r0 : ec83c500
[ 38.157573] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
[ 38.164890] Control: 10c5387d Table: 3d1c004a DAC: 00000051
[ 38.170789] Process kworker/0:2 (pid: 468, stack limit = 0xbc71b316)
[ 38.177306] Stack: (0xed185ea0 to 0xed186000)
[ 38.181794] 5ea0: ef7a9100 c0619ba3 ec0870e0 ec087068 ec0870e0 00000000 60010013 c0619ba3
[ 38.190184] 5ec0: ef7a8d00 ec83c54c c1104cc8 ec83e54c ec83c924 ec83c000 ec83c500 ec71f700
[ 38.198573] 5ee0: ec83ece4 c0628d0c ecbd16c0 c0c0237c ed185f3c c0b26de0 ec131c04 c0619ba3
[ 38.206966] 5f00: c1153fe4 ec83c54c ecdcc100 ef7a8d00 ef7a9e00 00000000 ec83c550 00000000
[ 38.221770] 5f20: ef7a8d00 c0136aec c1103d00 ef7a8d18 ecdcc100 ef7a8d00 ecdcc114 c1103d00
[ 38.236801] 5f40: ef7a8d18 ffffe000 00000008 c0136d40 ec521c70 c1176068 c0e4213c ed184000
[ 38.251874] 5f60: ecfecfdc ecfecfc0 ed0db1c0 00000000 ed184000 ecdcc100 c0136cfc ec0a3ea4
[ 38.267199] 5f80: ecfecfdc c013c810 00000000 ed0db1c0 c013c6c8 00000000 00000000 00000000
[ 38.282612] 5fa0: 00000000 00000000 00000000 c01010e8 00000000 00000000 00000000 00000000
[ 38.298080] 5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 38.313792] 5fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
[ 38.329716] [<c0621238>] (e1000e_update_stats) from [<c0628d0c>] (e1000_watchdog_task+0xe8/0x71c)
[ 38.346597] [<c0628d0c>] (e1000_watchdog_task) from [<c0136aec>] (process_one_work+0x1f0/0x400)
[ 38.363493] [<c0136aec>] (process_one_work) from [<c0136d40>] (worker_thread+0x44/0x584)
[ 38.379844] [<c0136d40>] (worker_thread) from [<c013c810>] (kthread+0x148/0x150)
[ 38.395575] [<c013c810>] (kthread) from [<c01010e8>] (ret_from_fork+0x14/0x2c)
[ 38.411119] Exception stack(0xed185fb0 to 0xed185ff8)
[ 38.420331] 5fa0: 00000000 00000000 00000000 00000000
[ 38.436662] 5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 38.452933] 5fe0: 00000000 00000000 00000000 00000000 00000013 00000000
[ 38.463661] Code: e590641c e2833901 e5931000 f57ff04f (e280ad9f)
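
For anyone matching the 0x1406 in the log against the "16 + 6" index in
the quoted hook_fault_code() call: as far as I can tell, the fault
status is FS[3:0] from the low four bits of the FSR, with FS[4] taken
from bit 10. Below is a small userspace sketch of that decoding. It
assumes the short-descriptor (non-LPAE) FSR layout, and the macro and
function names only mirror what I believe arch/arm/mm/fault.h uses, so
treat them as illustrative rather than a copy of the kernel code.

```c
#include <stdio.h>

/* Assumed short-descriptor (non-LPAE) FSR bit layout. */
#define FSR_FS3_0	0x0fu		/* FS[3:0], low four status bits */
#define FSR_FS4		(1u << 10)	/* FS[4], fifth status bit */
#define FSR_WRITE	(1u << 11)	/* WnR: set if the access was a write */

/* Fold FS[4] (bit 10) down next to FS[3:0] to get the 5-bit index. */
static unsigned int fsr_fs(unsigned int fsr)
{
	return (fsr & FSR_FS3_0) | ((fsr & FSR_FS4) >> 6);
}

/* Non-zero if the aborting access is reported as a write. */
static int fsr_is_write(unsigned int fsr)
{
	return (fsr & FSR_WRITE) != 0;
}

/* Print the decoded fields for one raw FSR value. */
static void fsr_print(unsigned int fsr)
{
	printf("fsr=0x%04x index=%u write=%d\n",
	       fsr, fsr_fs(fsr), fsr_is_write(fsr));
}
```

Calling fsr_print(0x1406) gives index 22, i.e. 16 + 6, which is the slot
the proposed no-op handler would be hooked into. The existing "external
abort on non-linefetch" slot corresponds to a different index, so the
two handlers would not collide.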

--
Stefan