BUG: net-next (7.0-rc6 based and later) fails to boot on Jetson Xavier NX
From: Russell King (Oracle)
Date: Wed Apr 08 2026 - 09:10:41 EST
Hi,
Just a heads-up that current net-next (v7.0-rc6 based) fails to boot on
my nVidia Jetson Xavier platform. v7.0-rc5 and v6.14 based net-next both
boot fine. This is an arm64 platform.
The problem appears to be completely random in terms of its symptoms,
and looks like severe memory corruption - every boot seems to produce
a different problem. The common theme is, although the kernel gets to
userspace, it never gets anywhere close to a login prompt before
failing in some way.
The last net-next+ boot (which is currently v7.0-rc6 based) resulted
in:
tegra-mc 2c00000.memory-controller: xusb_hostw: secure write @0x00000003ffffff00: VPR violation ((null))
...
irq 91: nobody cared (try booting with the "irqpoll" option)
...
depmod: ERROR: could not open directory /lib/modules/7.0.0-rc6-net-next+: No such file or directory
...
Unable to handle kernel paging request at virtual address 0003201fd50320cf
A previous boot of the exact same kernel didn't oops, but was unable
to find the block device to mount for /mnt via block UUID.
The boot before that resulted in an oops.
The interesting thing is - the depmod error above is incorrect:
root@tegra-ubuntu:~# ls -ld /lib/modules/7.0.0-rc6-net-next+
drwxrwxr-x 3 root root 4096 Apr 8 10:23 /lib/modules/7.0.0-rc6-net-next+
The directory is definitely there, and is readable - checked after
booting back into net-next based on 7.0-rc5. In some of these boots,
stmmac hasn't probed yet, which rules out my changes.
Rootfs is ext4, and it seems there were a lot of ext4 commits merged
between rc5 and rc6, but nothing for rc7.
My current net-next head is dfecb0c5af3b. Merging rc7 on top also
fails, I suspect just as randomly. With that, I just got:
EXT4-fs (mmcblk0p1): VFS: Can't find ext4 filesystem
mount: /mnt: wrong fs type, bad option, bad superblock on /dev/mmcblk0p1, missing codepage or helper program, or other error.
mount: /mnt/: can't find PARTUUID=741c0777-391a-4bce-a222-455e180ece2a.
Unable to handle kernel paging request at virtual address f9bf0011ac0fb893
Mem abort info:
ESR = 0x0000000096000004
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x04: level 0 translation fault
Data abort info:
ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
CM = 0, WnR = 0, TnD = 0, TagAccess = 0
GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[f9bf0011ac0fb893] address between user and kernel address ranges
Internal error: Oops: 0000000096000004 [#1] SMP
Modules linked in:
CPU: 1 UID: 0 PID: 936 Comm: mount Not tainted 7.0.0-rc7-net-next+ #649 PREEMPT
Hardware name: NVIDIA NVIDIA Jetson Xavier NX Developer Kit/Jetson, BIOS 6.0-37391689 08/28/2024
pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : refill_objects+0x298/0x5ec
lr : refill_objects+0x1f0/0x5ec
...
Call trace:
refill_objects+0x298/0x5ec (P)
__pcs_replace_empty_main+0x13c/0x3a8
kmem_cache_alloc_noprof+0x324/0x3a0
alloc_iova+0x3c/0x290
alloc_iova_fast+0x168/0x2d4
iommu_dma_alloc_iova+0x84/0x154
iommu_dma_map_sg+0x2c4/0x538
__dma_map_sg_attrs+0x124/0x2c0
dma_map_sg_attrs+0x10/0x20
sdhci_pre_dma_transfer+0xb8/0x164
sdhci_pre_req+0x38/0x44
mmc_blk_mq_issue_rq+0x3dc/0x920
mmc_mq_queue_rq+0x104/0x2b0
__blk_mq_issue_directly+0x38/0xb0
blk_mq_request_issue_directly+0x54/0xb4
blk_mq_issue_direct+0x84/0x180
blk_mq_dispatch_queue_requests+0x1a8/0x2e0
blk_mq_flush_plug_list+0x60/0x140
__blk_flush_plug+0xe0/0x11c
blk_finish_plug+0x38/0x4c
read_pages+0x158/0x260
page_cache_ra_unbounded+0x158/0x3e0
force_page_cache_ra+0xb0/0xe4
page_cache_sync_ra+0x88/0x480
filemap_get_pages+0xd8/0x850
filemap_read+0xdc/0x3d8
blkdev_read_iter+0x84/0x198
vfs_read+0x208/0x2d8
ksys_read+0x58/0xf4
__arm64_sys_read+0x1c/0x28
invoke_syscall.constprop.0+0x50/0xe0
do_el0_svc+0x40/0xc0
el0_svc+0x48/0x2a0
el0t_64_sync_handler+0xa0/0xe4
el0t_64_sync+0x19c/0x1a0
Code: 54000189 f9000022 aa0203e4 b9402ae3 (f8634840)
---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Oops: Fatal exception
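For anyone checking the abort information by hand, here is a minimal Python sketch that decodes the ESR value and classifies the faulting address from the oops above. It is not kernel code. The bit positions follow the Arm ESR_EL1 data abort encoding; the helper names decode_esr and classify_address are made up for illustration, and the 48-bit virtual address size is an assumption about this kernel's configuration.

```python
# Sketch only: helper names and the 48-bit VA size are assumptions.

# DFSC values for translation faults, mapped to the page table level.
TRANSLATION_FAULT_LEVEL = {0x04: 0, 0x05: 1, 0x06: 2, 0x07: 3}


def decode_esr(esr):
    """Split an ESR_EL1 value for a data abort into its main fields."""
    iss = esr & 0x1FFFFFF              # ISS is bits [24:0]
    return {
        "EC": (esr >> 26) & 0x3F,      # exception class, bits [31:26]
        "IL": (esr >> 25) & 1,         # 1 = 32-bit instruction
        "ISS": iss,
        "ISV": (iss >> 24) & 1,
        "SET": (iss >> 11) & 0x3,
        "FnV": (iss >> 10) & 1,
        "EA": (iss >> 9) & 1,
        "CM": (iss >> 8) & 1,
        "S1PTW": (iss >> 7) & 1,
        "WnR": (iss >> 6) & 1,         # 0 = read, 1 = write
        "FSC": iss & 0x3F,             # fault status code, bits [5:0]
    }


def classify_address(addr, va_bits=48):
    """Classify a 64-bit virtual address by its bits above va_bits."""
    top = addr >> va_bits
    all_ones = (1 << (64 - va_bits)) - 1
    if top == 0:
        return "user"
    if top == all_ones:
        return "kernel"
    return "between user and kernel"


fields = decode_esr(0x96000004)
level = TRANSLATION_FAULT_LEVEL.get(fields["FSC"])
print(f"EC={fields['EC']:#x} IL={fields['IL']} WnR={fields['WnR']} "
      f"FSC={fields['FSC']:#x} translation fault level={level}")
print(classify_address(0xF9BF0011AC0FB893))
```

The decoded fields agree with what the kernel printed: a read (WnR = 0) that took a level 0 translation fault on an address belonging to neither the user nor the kernel range. That is what one would expect from dereferencing a corrupted pointer, rather than from a valid but unmapped kernel address.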
Looking at the changes between rc5 and rc6, there's one drivers/block
change for zram (which is used on this platform), one change in
drivers/base for regmap, nothing for drivers/mmc, but plenty for
fs/ext4. There are five DMA API changes.
Now building straight -rc7. If that also fails, my plan is to start
bisecting rc5..rc6, which will likely take most of the rest of the
day. So, in the meantime, I'm sending this as a heads-up that rc6
and onwards have a problem.
I'll update when I have a potential commit located.
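To give a rough idea of why the bisection takes a while, here is a minimal Python sketch of the search over a linear list of commits. It is an illustration, not what git bisect does internally (git works on the commit graph, not a flat list). The function first_bad, the commit names, and the is_bad predicate are all hypothetical. In practice is_bad means "build, boot, and watch for a failure", and with symptoms this random a "good" verdict may need more than one clean boot.

```python
from typing import Callable, Sequence, Tuple


def first_bad(commits: Sequence[str],
              is_bad: Callable[[str], bool]) -> Tuple[str, int]:
    """Find the first bad commit in a list ordered oldest to newest.

    Assumes the last commit is known bad and that every commit after
    the first bad one is also bad. Returns the commit and the number
    of commits that had to be tested.
    """
    lo, hi = 0, len(commits) - 1
    tests = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tests += 1
        if is_bad(commits[mid]):
            hi = mid          # first bad commit is at mid or earlier
        else:
            lo = mid + 1      # first bad commit is after mid
    return commits[lo], tests


# Hypothetical history: 1000 commits, breakage introduced at c637.
history = [f"c{i}" for i in range(1000)]
culprit, tests = first_bad(history, lambda c: int(c[1:]) >= 637)
print(culprit, tests)
```

The number of tests grows with the base-2 logarithm of the number of commits, so even a large range needs only a handful of steps, but each step here costs a full kernel build and at least one boot on the board.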
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!