Re: Regression: QCA6390 fails with "mm/page_alloc: place pages to tail in __free_pages_core()"

From: wi nk
Date: Fri Nov 13 2020 - 10:50:15 EST


On Fri, Nov 13, 2020 at 2:56 PM David Hildenbrand <david@xxxxxxxxxx> wrote:
>
> On 13.11.20 14:36, wi nk wrote:
> > On Fri, Nov 13, 2020 at 1:52 PM Pavel Procopiuc
> > <pavel.procopiuc@xxxxxxxxx> wrote:
> >>
> >> Op 13.11.2020 om 12:08 schreef Carl Huang:
> >>> Checked some logs. It looks like when the error happens, the physical addresses are
> >>> very small. They're between 20M and 30M.
> >>>
> >>> So could you try reserving the memory starting from 20M?
> >>> Add "memmap=10M\$20M" to your grub.cfg or edit the kernel parameters, so ath11k
> >>> can't allocate from these addresses.
> >>>
> >>> Or you can try to reserve even larger memory starting from 20M.
> >>
> >> That worked, booting with memmap=12M$20M resulted in the working wifi:
> >>
> >> $ journalctl -b | grep -iP '05:00|ath11k|Linux version|memmap'
> >> Nov 13 13:45:34 razor kernel: Linux version 5.10.0-rc2 (root@razor) (gcc (Gentoo 9.3.0-r1 p3) 9.3.0, GNU ld (Gentoo 2.34
> >> p6) 2.34.0) #1 SMP Fri Nov 13 13:29:48 CET 2020
> >> Nov 13 13:45:34 razor kernel: Command line: ro root=/dev/nvme0n1p2 resume=/dev/nvme1n1p1 zram.num_devices=2
> >> memmap=12M$20M quiet
> >> Nov 13 13:45:34 razor kernel: DMA zone: 64 pages used for memmap
> >> Nov 13 13:45:34 razor kernel: DMA32 zone: 5165 pages used for memmap
> >> Nov 13 13:45:34 razor kernel: Normal zone: 255840 pages used for memmap
> >> Nov 13 13:45:34 razor kernel: Kernel command line: ro root=/dev/nvme0n1p2 resume=/dev/nvme1n1p1 zram.num_devices=2
> >> memmap=12M$20M quiet ro root=/dev/nvme0n1p2 resume=/dev/nvme1n1p1 zram.num_devices=2 memmap=12M$20M quiet
> >> Nov 13 13:45:34 razor kernel: pci 0000:05:00.0: [17cb:1101] type 00 class 0x028000
> >> Nov 13 13:45:34 razor kernel: pci 0000:05:00.0: reg 0x10: [mem 0xd2100000-0xd21fffff 64bit]
> >> Nov 13 13:45:34 razor kernel: pci 0000:05:00.0: PME# supported from D0 D3hot D3cold
> >> Nov 13 13:45:34 razor kernel: pci 0000:05:00.0: 4.000 Gb/s available PCIe bandwidth, limited by 5.0 GT/s PCIe x1 link at
> >> 0000:00:1c.1 (capable of 7.876 Gb/s with 8.0 GT/s PCIe x1 link)
> >> Nov 13 13:45:34 razor kernel: pci 0000:05:00.0: Adding to iommu group 21
> >> Nov 13 13:45:35 razor kernel: ath11k_pci 0000:05:00.0: WARNING: ath11k PCI support is experimental!
> >> Nov 13 13:45:35 razor kernel: ath11k_pci 0000:05:00.0: BAR 0: assigned [mem 0xd2100000-0xd21fffff 64bit]
> >> Nov 13 13:45:35 razor kernel: ath11k_pci 0000:05:00.0: enabling device (0000 -> 0002)
> >> Nov 13 13:45:35 razor kernel: ath11k_pci 0000:05:00.0: MSI vectors: 32
> >> Nov 13 13:45:35 razor kernel: mhi 0000:05:00.0: Requested to power ON
> >> Nov 13 13:45:35 razor kernel: mhi 0000:05:00.0: Power on setup success
> >> Nov 13 13:45:35 razor kernel: ath11k_pci 0000:05:00.0: Respond mem req failed, result: 1, err: 0
> >> Nov 13 13:45:35 razor kernel: ath11k_pci 0000:05:00.0: qmi failed to respond fw mem req:-22
> >> Nov 13 13:45:35 razor kernel: ath11k_pci 0000:05:00.0: req mem_seg[0] 0x2100000 524288 1
> >> Nov 13 13:45:35 razor kernel: ath11k_pci 0000:05:00.0: req mem_seg[1] 0x2180000 524288 1
> >> Nov 13 13:45:35 razor kernel: ath11k_pci 0000:05:00.0: req mem_seg[2] 0x2200000 524288 1
> >> Nov 13 13:45:35 razor kernel: ath11k_pci 0000:05:00.0: req mem_seg[3] 0x2280000 294912 1
> >> Nov 13 13:45:35 razor kernel: ath11k_pci 0000:05:00.0: req mem_seg[4] 0x2300000 524288 1
> >> Nov 13 13:45:35 razor kernel: ath11k_pci 0000:05:00.0: req mem_seg[5] 0x2380000 524288 1
> >> Nov 13 13:45:35 razor kernel: ath11k_pci 0000:05:00.0: req mem_seg[6] 0x2400000 458752 1
> >> Nov 13 13:45:35 razor kernel: ath11k_pci 0000:05:00.0: req mem_seg[7] 0x20c0000 131072 1
> >> Nov 13 13:45:35 razor kernel: ath11k_pci 0000:05:00.0: req mem_seg[8] 0x2480000 524288 4
> >> Nov 13 13:45:35 razor kernel: ath11k_pci 0000:05:00.0: req mem_seg[9] 0x2500000 360448 4
> >> Nov 13 13:45:35 razor kernel: ath11k_pci 0000:05:00.0: req mem_seg[10] 0x20a4000 16384 1
> >> Nov 13 13:45:35 razor kernel: ath11k_pci 0000:05:00.0: chip_id 0x0 chip_family 0xb board_id 0xff soc_id 0xffffffff
> >> Nov 13 13:45:35 razor kernel: ath11k_pci 0000:05:00.0: fw_version 0x101c06cc fw_build_timestamp 2020-06-24 19:50
> >> fw_build_id
> >> Nov 13 13:45:37 razor NetworkManager[782]: <info> [1605271537.1168] rfkill1: found Wi-Fi radio killswitch (at
> >> /sys/devices/pci0000:00/0000:00:1c.1/0000:05:00.0/ieee80211/phy0/rfkill1) (driver ath11k_pci)
> >> Nov 13 13:45:39 razor ModemManager[722]: <info> Couldn't check support for device
> >> '/sys/devices/pci0000:00/0000:00:1c.1/0000:05:00.0': not supported by any plugin
> >> Nov 13 13:45:45 razor kernel: ath11k_pci 0000:05:00.0: failed to enqueue rx buf: -28
> >>
> >
> > When I attempt to boot my 5.10rc2 kernel with that memmap option, my
> > machine immediately hangs. That said, it seems to have done something
> > bizarre, as immediately afterwards, if I remove that option and let
> > 5.10 boot normally, it seems to boot and bring up the wifi adapter ok
> > (which didn't happen before). Now that I've managed to boot 5.10
> > twice, the first time after a couple of minutes my video started going
> > nuts and displaying all sorts of artifacts[1]. This time things seem
> > to be functioning nominally (wifi is online and the machine is
> > behaving properly). I may just never turn it off again :D.
>
> Honestly, that FW sounds horribly flawed. :)
>
> Would be interesting what happens when you boot back to 5.9 now ...
>
> --
> Thanks,
>
> David / dhildenb
>

Well, nothing super interesting: rebooting to 5.9 hard-locked the
machine once the adapter associated, before I could do much.
Rebooting back to 5.10 worked, and it booted fine (I'm sending this
email with it). There's definitely something non-deterministic that
lets the driver work occasionally and makes it fail/panic a bit more
often. Are there other memory / device allocation settings I can tweak
to see if something settles it down?
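
In case it helps anyone compare logs, here is a rough Python sketch for
checking where the "req mem_seg" addresses land relative to a memmap=
reservation. It assumes the second field on each "req mem_seg" line is the
segment's start address and the third is its size in bytes; that is my
reading of the log, not something confirmed in this thread. The helper names
are made up for illustration, and the parser only handles the plain
decimal nn[KMG]$ss[KMG] form. The three sample lines are copied from
Pavel's log above.

```python
import re

# Multipliers for the size suffixes handled by this sketch.
UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}

# Matches lines such as: "req mem_seg[0] 0x2100000 524288 1"
SEG_RE = re.compile(r"req mem_seg\[(\d+)\] (0x[0-9a-fA-F]+) (\d+) (\d+)")

SAMPLE_LOG = """\
ath11k_pci 0000:05:00.0: req mem_seg[0] 0x2100000 524288 1
ath11k_pci 0000:05:00.0: req mem_seg[7] 0x20c0000 131072 1
ath11k_pci 0000:05:00.0: req mem_seg[10] 0x20a4000 16384 1
"""


def parse_size(text):
    """Convert a decimal size like '12M' or '4096' to bytes."""
    m = re.fullmatch(r"(\d+)([KMG]?)", text.upper())
    if m is None:
        raise ValueError(f"unsupported size: {text!r}")
    return int(m.group(1)) * UNITS[m.group(2)]


def parse_memmap_reserve(option):
    """Parse 'memmap=SIZE$START' into (start, end), end exclusive."""
    value = option.split("=", 1)[1]
    size_text, start_text = value.split("$", 1)
    start = parse_size(start_text)
    return start, start + parse_size(size_text)


def parse_segments(log_text):
    """Return (index, start, end) for every 'req mem_seg' line found."""
    segments = []
    for m in SEG_RE.finditer(log_text):
        start = int(m.group(2), 16)
        segments.append((int(m.group(1)), start, start + int(m.group(3))))
    return segments


def overlaps(segment, reserved):
    """True if the segment intersects the reserved [start, end) range."""
    _, seg_start, seg_end = segment
    return seg_start < reserved[1] and seg_end > reserved[0]


if __name__ == "__main__":
    reserved = parse_memmap_reserve("memmap=12M$20M")
    print(f"reserved: {reserved[0]:#x}-{reserved[1]:#x}")
    for seg in parse_segments(SAMPLE_LOG):
        state = "INSIDE" if overlaps(seg, reserved) else "outside"
        print(f"mem_seg[{seg[0]}] {seg[1]:#x}-{seg[2]:#x} {state}")
```

To check a full boot, feed it the output of journalctl -b instead of the
sample lines. If the assumption about the fields holds, a segment reported
as INSIDE would mean the reservation did not keep the driver out of that
range.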