Re: [PATCH v2 12/32] mm/vmalloc: vmalloc_to_page() use pte_offset_kernel()

From: Lorenzo Stoakes
Date: Mon Jul 10 2023 - 13:18:35 EST


On Mon, Jul 10, 2023 at 03:42:31PM +0100, Mark Brown wrote:
> On Thu, Jun 08, 2023 at 06:21:41PM -0700, Hugh Dickins wrote:
> > vmalloc_to_page() was using pte_offset_map() (followed by pte_unmap()),
> > but it's intended for userspace page tables: prefer pte_offset_kernel().
> >
> > Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx>
> > Reviewed-by: Lorenzo Stoakes <lstoakes@xxxxxxxxx>
>
> Currently Linus' tree is reliably failing to boot on pine64plus, an
> arm64 SBC. Most other boards seem fine, though I am seeing some
> additional instability on Tritium, which is another Allwinner platform.
> I've not dug into that yet, and Tritium is generally less stable.
>
> We end up seeing NULL or otherwise bad pointer dereferences. The
> specific error does vary a bit, though it mostly appears to be in the
> pinctrl code. A bisect (full log below) identified this patch as
> introducing the failure. Nothing is jumping out at me about the patch,
> and it's not affecting everything, so I'd not be surprised if it's just
> uncovering some bug in the platform support, but I'm not super familiar
> with the code.

Yeah, that seems likely. Do you have a .config you can share for this board?
On a 64-bit device you'd expect this change to be a no-op.

>
> Sample backtrace:
>
> [ 1.919725] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
> [ 1.928551] Mem abort info:
> [ 1.931359] ESR = 0x0000000096000044
>
> ...
>
> [ 1.968870] [0000000000000000] user address but active_mm is swapper
>
> ...
>
> [ 2.093969] Call trace:
> [ 2.096414] dt_remember_or_free_map+0xc8/0x120
> [ 2.100949] pinctrl_dt_to_map+0x23c/0x364
> [ 2.105050] create_pinctrl+0x68/0x3ec
> [ 2.108803] pinctrl_get+0xb0/0x124
> [ 2.112294] devm_pinctrl_get+0x48/0x90
> [ 2.116133] pinctrl_bind_pins+0x58/0x158
> [ 2.120148] really_probe+0x54/0x2b0
> [ 2.123724] __driver_probe_device+0x78/0x12c
>
> Another common theme is the same but with an address like 0x4c and:
>
> [ 2.098328] __kmem_cache_alloc_node+0x1bc/0x2dc
> [ 2.102947] kmalloc_trace+0x20/0x2c
> [ 2.106524] pinctrl_register_mappings+0x98/0x178
>
> Full boot log from a failure:
>
> https://lava.sirena.org.uk/scheduler/job/712456
>
> git bisect start
> # bad: [06c2afb862f9da8dc5efa4b6076a0e48c3fbaaa5] Linux 6.5-rc1
> git bisect bad 06c2afb862f9da8dc5efa4b6076a0e48c3fbaaa5
> # good: [6995e2de6891c724bfeb2db33d7b87775f913ad1] Linux 6.4
> git bisect good 6995e2de6891c724bfeb2db33d7b87775f913ad1
> # bad: [1b722407a13b7f8658d2e26917791f32805980a2] Merge tag 'drm-next-2023-06-29' of git://anongit.freedesktop.org/drm/drm
> git bisect bad 1b722407a13b7f8658d2e26917791f32805980a2
> # bad: [3a8a670eeeaa40d87bd38a587438952741980c18] Merge tag 'net-next-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
> git bisect bad 3a8a670eeeaa40d87bd38a587438952741980c18
> # bad: [6e17c6de3ddf3073741d9c91a796ee696914d8a0] Merge tag 'mm-stable-2023-06-24-19-15' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
> git bisect bad 6e17c6de3ddf3073741d9c91a796ee696914d8a0
> # good: [2605e80d3438c77190f55b821c6575048c68268e] Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
> git bisect good 2605e80d3438c77190f55b821c6575048c68268e
> # good: [72dc6db7e3b692f46f3386b8dd5101d3f431adef] Merge tag 'wq-for-6.5-cleanup-ordered' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
> git bisect good 72dc6db7e3b692f46f3386b8dd5101d3f431adef
> # bad: [179d3e4f3bfa5947821c1b1bc6aa49a4797b7f21] mm/madvise: clean up force_shm_swapin_readahead()
> git bisect bad 179d3e4f3bfa5947821c1b1bc6aa49a4797b7f21
> # good: [523716770e63e229dbb6307d663f03d990dfefc5] maple_tree: rework mtree_alloc_{range,rrange}()
> git bisect good 523716770e63e229dbb6307d663f03d990dfefc5
> # good: [b764253c18821da31c49a260f92f5d093cf1637e] selftests/mm: fix "warning: expression which evaluates to zero..." in mlock2-tests.c
> git bisect good b764253c18821da31c49a260f92f5d093cf1637e
> # good: [5c7f3bf04a6cf266567fdea1ae4987875e92619f] s390: allow pte_offset_map_lock() to fail
> git bisect good 5c7f3bf04a6cf266567fdea1ae4987875e92619f
> # good: [0d940a9b270b9220dcff74d8e9123c9788365751] mm/pgtable: allow pte_offset_map[_lock]() to fail
> git bisect good 0d940a9b270b9220dcff74d8e9123c9788365751
> # bad: [0d1c81edc61e553ed7a5db18fb8074c8b78e1538] mm/vmalloc: vmalloc_to_page() use pte_offset_kernel()
> git bisect bad 0d1c81edc61e553ed7a5db18fb8074c8b78e1538
> # good: [2798bbe75b9c2752b46d292e5c2a49f49da36418] mm/page_vma_mapped: pte_offset_map_nolock() not pte_lockptr()
> git bisect good 2798bbe75b9c2752b46d292e5c2a49f49da36418
> # good: [be872f83bf571f4f9a0ac25e2c9c36e905a36619] mm/pagewalk: walk_pte_range() allow for pte_offset_map()
> git bisect good be872f83bf571f4f9a0ac25e2c9c36e905a36619
> # good: [e5ad581c7f1c32d309ae4e895eea0cd1a3d9f363] mm/vmwgfx: simplify pmd & pud mapping dirty helpers
> git bisect good e5ad581c7f1c32d309ae4e895eea0cd1a3d9f363
> # first bad commit: [0d1c81edc61e553ed7a5db18fb8074c8b78e1538] mm/vmalloc: vmalloc_to_page() use pte_offset_kernel()