Re: 6.11/regression/bisected - commit 1541d63c5fe2 made my system unbootable (general protection fault, probably for non-canonical address 0xdffffc00000000a9)

From: Mikhail Gavrilov
Date: Mon Jul 29 2024 - 15:52:25 EST


On Sun, Jul 21, 2024 at 10:20 PM Mikhail Gavrilov
<mikhail.v.gavrilov@xxxxxxxxx> wrote:
>
> Hi,
> The second Fedora update
> (kernel-debug-6.11.0-0.rc0.20240717git51835949dda3.5.fc41.x86_64) with
> the 6.11 kernel made my system unbootable.
> The trace looks like:
> Oops: general protection fault, probably for non-canonical address
> 0xdffffc00000000a9: 0000 [#1] PREEMPT SMP KASAN NOPTI
> KASAN: null-ptr-deref in range [0x0000000000000548-0x000000000000054f]
> CPU: 1 PID: 1472 Comm: NetworkManager Tainted: G W L
> 6.10.0-rc5-10-1541d63c5fe2cebce85b2af84a2850a302ffda9c+ #683
> Hardware name: ASUS System Product Name/ROG STRIX B650E-I GAMING WIFI,
> BIOS 2611 04/07/2024
> RIP: 0010:mt792x_remove_interface+0x299/0x6d0 [mt792x_lib]
> Code: 48 c1 e9 03 80 3c 11 00 0f 85 1c 03 00 00 48 ba 00 00 00 00 00
> fc ff df 4d 8b 70 18 49 8d be 48 05 00 00 48 89 f9 48 c1 e9 03 <80> 3c
> 11 00 0f 85 e4 02 00 00 4d 8b b6 48 05 00 00 48 ba 00 00 00
> RSP: 0018:ffffc90006b7ec28 EFLAGS: 00010216
> RAX: fffffffffffffffe RBX: ffff88829f1d6990 RCX: 00000000000000a9
> RDX: dffffc0000000000 RSI: 0000000000000008 RDI: 0000000000000548
> RBP: ffff8881d0f43320 R08: ffff88829f1d6e28 R09: fffff52000d6fd31
> R10: ffffc90006b7e98f R11: 0000000000000001 R12: ffff88829f1d6f00
> R13: ffff8881d0f4cc98 R14: 0000000000000000 R15: ffff8881d0f43ce0
> FS: 00007f61c2b1d540(0000) GS:ffff888fd7200000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 000055d8b9f117f8 CR3: 00000001ab702000 CR4: 0000000000f50ef0
> PKRU: 55555554
> Call Trace:
> <TASK>
> ? __die_body.cold+0x19/0x27
> ? die_addr+0x46/0x70
> ? exc_general_protection+0x14f/0x250
> ? asm_exc_general_protection+0x26/0x30
> ? mt792x_remove_interface+0x299/0x6d0 [mt792x_lib]
> ? mt792x_remove_interface+0x174/0x6d0 [mt792x_lib]
> drv_remove_interface+0x203/0x490 [mac80211]
> ieee80211_do_stop+0xfed/0x2090 [mac80211]
> ? __pfx_ieee80211_do_stop+0x10/0x10 [mac80211]
> ? __pfx_lock_release+0x10/0x10
> ? mark_held_locks+0x94/0xe0
> ? _raw_spin_unlock_irqrestore+0x66/0x80
> ieee80211_stop+0x10b/0x720 [mac80211]
> __dev_close_many+0x1a0/0x2c0
> ? __pfx___dev_close_many+0x10/0x10
> ? mark_held_locks+0x94/0xe0
> ? __local_bh_enable_ip+0xaf/0x140
> __dev_change_flags+0x265/0x660
> ? __pfx___dev_change_flags+0x10/0x10
> dev_change_flags+0x80/0x160
> do_setlink+0x2668/0x33e0
> ? __pfx_lock_release+0x10/0x10
> ? __pfx_do_setlink+0x10/0x10
> ? arch_stack_walk+0x79/0x100
> ? __pfx_stack_trace_consume_entry+0x10/0x10
> ? is_bpf_text_address+0x6e/0x100
> ? kernel_text_address+0x145/0x160
> ? __kernel_text_address+0x12/0x40
> ? unwind_get_return_address+0x5e/0xa0
> ? arch_stack_walk+0xac/0x100
> ? __asan_memset+0x23/0x50
> ? __nla_validate_parse+0xb6/0x2670
> ? stack_trace_save+0x94/0xd0
> ? __pfx___nla_validate_parse+0x10/0x10
> ? stack_depot_save_flags+0x28/0x8f0
> __rtnl_newlink+0xb1d/0x1600
> ? __pfx___rtnl_newlink+0x10/0x10
> rtnl_newlink+0xc0/0x100
> rtnetlink_rcv_msg+0x2f3/0xb20
> ? __pfx_rtnetlink_rcv_msg+0x10/0x10
> ? __pfx___lock_acquire+0x10/0x10
> ? __pfx___lock_acquire+0x10/0x10
> netlink_rcv_skb+0x13d/0x3b0
> ? __pfx_rtnetlink_rcv_msg+0x10/0x10
> ? __pfx_netlink_rcv_skb+0x10/0x10
> ? netlink_deliver_tap+0xcb/0xaf0
> ? netlink_deliver_tap+0x14b/0xaf0
> netlink_unicast+0x42e/0x6e0
> ? __pfx_netlink_unicast+0x10/0x10
> ? __virt_addr_valid+0x228/0x420
> netlink_sendmsg+0x765/0xc20
> ? __pfx_netlink_sendmsg+0x10/0x10
> ? __import_iovec+0x399/0x690
> ? __pfx_netlink_sendmsg+0x10/0x10
> ____sys_sendmsg+0x97f/0xc60
> ? copy_msghdr_from_user+0x270/0x430
> ? __pfx_____sys_sendmsg+0x10/0x10
> ? __pfx_copy_msghdr_from_user+0x10/0x10
> ? __pfx___lock_acquire+0x10/0x10
> ___sys_sendmsg+0xfd/0x180
> ? __pfx____sys_sendmsg+0x10/0x10
> __sys_sendmsg+0x19c/0x220
> ? __pfx___sys_sendmsg+0x10/0x10
> ? ktime_get_coarse_real_ts64+0x41/0xd0
> do_syscall_64+0x97/0x190
> ? lockdep_hardirqs_on_prepare+0x171/0x400
> ? do_syscall_64+0xa3/0x190
> ? lockdep_hardirqs_on+0x7c/0x100
> ? do_syscall_64+0xa3/0x190
> ? do_user_addr_fault+0x4ce/0xad0
> ? local_clock_noinstr+0xd/0x100
> ? __pfx_lock_release+0x10/0x10
> ? handle_mm_fault+0x47d/0x8d0
> ? lockdep_hardirqs_on_prepare+0x171/0x400
> entry_SYSCALL_64_after_hwframe+0x76/0x7e
> RIP: 0033:0x7f61c392bb6b
> Code: 48 89 e5 48 83 ec 20 89 55 ec 48 89 75 f0 89 7d f8 e8 c9 5b f7
> ff 8b 55 ec 48 8b 75 f0 41 89 c0 8b 7d f8 b8 2e 00 00 00 0f 05 <48> 3d
> 00 f0 ff ff 77 2d 44 89 c7 48 89 45 f8 e8 21 5c f7 ff 48 8b
> RSP: 002b:00007ffc9a9ece90 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
> RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f61c392bb6b
> RDX: 0000000000000000 RSI: 00007ffc9a9eced0 RDI: 000000000000000d
> RBP: 00007ffc9a9eceb0 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000293 R12: 000055d8b9e83850
> R13: 0000000000000010 R14: 00007ffc9a9ed06c R15: 0000000000000000
> </TASK>
> Modules linked in: nf_conntrack_netbios_ns nf_conntrack_broadcast
> nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet
> nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat
> nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables qrtr bnep
> sunrpc binfmt_misc amd_atl intel_rapl_msr intel_rapl_common mt7921e
> mt7921_common mt792x_lib mt76_connac_lib mt76 edac_mce_amd vfat btusb
> snd_hda_codec_hdmi fat btrtl mac80211 uvcvideo snd_hda_intel
> snd_usb_audio(+) btintel snd_intel_dspcfg snd_intel_sdw_acpi btbcm
> snd_hda_codec kvm_amd btmtk uvc videobuf2_vmalloc snd_usbmidi_lib
> videobuf2_memops snd_hda_core videobuf2_v4l2 snd_ump bluetooth
> snd_rawmidi snd_hwdep videobuf2_common snd_seq kvm videodev libarc4
> snd_seq_device asus_nb_wmi eeepc_wmi joydev mc snd_pcm cfg80211
> asus_wmi apple_mfi_fastcharge snd_timer sparse_keymap rapl pcspkr
> platform_profile wmi_bmof snd soundcore igc k10temp rfkill i2c_piix4
> gpio_amdpt gpio_generic loop nfnetlink zram hid_apple amdgpu amdxcp
> i2c_algo_bit drm_ttm_helper ttm crct10dif_pclmul crc32_pclmul drm_exec
> crc32c_intel gpu_sched polyval_clmulni polyval_generic
> drm_suballoc_helper drm_buddy nvme drm_display_helper
> ghash_clmulni_intel nvme_core sha512_ssse3 ccp sha256_ssse3 cec
> sha1_ssse3 sp5100_tco nvme_auth video wmi ip6_tables ip_tables fuse
> ---[ end trace 0000000000000000 ]---
>
> Bisect points to commit 1541d63c5fe2cebce85b2af84a2850a302ffda9c
> Author: Sean Wang <sean.wang@xxxxxxxxxxxx>
> Date: Wed Jun 12 20:02:40 2024 -0700
>
> wifi: mt76: mt7925: add mt7925_mac_link_bss_remove to remove per-link BSS
>
> The mt7925_mac_link_bss_remove function currently removes the per-link BSS.
> We will extend this function when we implement the MLO functionality.
>
> This patch only includes structural changes and does not involve any
> logic changes.
>
> Signed-off-by: Sean Wang <sean.wang@xxxxxxxxxxxx>
> Link: https://patch.msgid.link/20240613030241.5771-47-sean.wang@xxxxxxxxxx
> Signed-off-by: Felix Fietkau <nbd@xxxxxxxx>
>
> drivers/net/wireless/mediatek/mt76/mt792x_core.c | 35 +++++++++++++++++++++--------------
> 1 file changed, 21 insertions(+), 14 deletions(-)
>
> Unfortunately, I can't test a revert of commit 1541d63c5fe2 because
> the revert hits merge conflicts.
>
>
> > git reset d67978318827d06f1c0fa4c31343a279e9df6fde --hard
> Updating files: 100% (9962/9962), done.
> HEAD is now at d67978318827 Merge tag 'x86_cpu_for_v6.11_rc1' of
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
>
> > git revert -n 1541d63c5fe2cebce85b2af84a2850a302ffda9c
> Auto-merging drivers/net/wireless/mediatek/mt76/mt792x_core.c
> CONFLICT (content): Merge conflict in
> drivers/net/wireless/mediatek/mt76/mt792x_core.c
> error: could not revert 1541d63c5fe2... wifi: mt76: mt7925: add
> mt7925_mac_link_bss_remove to remove per-link BSS
> hint: after resolving the conflicts, mark the corrected paths
> hint: with 'git add <paths>' or 'git rm <paths>'
> hint: Disable this message with "git config advice.mergeConflict false"
>
> I have also attached the full kernel log and build config.
>
> My hardware specs are: https://linux-hardware.org/?probe=f95b7a2fb5
>
> Sean, can you look into this, please?


Sorry, but I can't continue testing 6.11.
This bug is a blocker for me,
and it is still not fixed in 6.11-rc1.
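
For reference, the faulting address in the quoted trace is consistent
with a KASAN shadow-memory lookup for a near-NULL pointer. The sketch
below reproduces that arithmetic. It assumes the usual x86_64 mapping,
shadow = (addr >> 3) + 0xdffffc0000000000; the helper names are made up
for illustration and are not kernel APIs. In the trace, RDI is 0x548,
RCX is 0xa9, RDX is the shadow offset and R14 is 0, which suggests (but
does not prove) a NULL base pointer with a field read at offset 0x548.

```python
# Sketch of the x86_64 KASAN shadow mapping (assumed defaults, see above).
KASAN_SHADOW_OFFSET = 0xDFFFFC0000000000  # assumed to match this kernel build
KASAN_SHADOW_SCALE_SHIFT = 3              # one shadow byte covers 8 bytes
MASK64 = (1 << 64) - 1


def shadow_of(addr):
    """Return the shadow address the KASAN check reads for `addr`."""
    return ((addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET) & MASK64


def covered_range(shadow):
    """Return the (first, last) memory addresses covered by one shadow byte."""
    first = ((shadow - KASAN_SHADOW_OFFSET) & MASK64) << KASAN_SHADOW_SCALE_SHIFT
    first &= MASK64
    return first, first + (1 << KASAN_SHADOW_SCALE_SHIFT) - 1


fault_addr = 0xDFFFFC00000000A9   # from the "general protection fault" line
first, last = covered_range(fault_addr)
print(hex(first), hex(last))      # prints: 0x548 0x54f
```

The result matches the range in the "KASAN: null-ptr-deref" line, and
shadow_of(0x548) gives back the non-canonical address from the oops.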

--
Best Regards,
Mike Gavrilov.

Attachment: dmesg-6.11.0-0.rc1.20240729gitdc1c8034e31b.16.fc41.x86_64+debug.zip
Description: Zip archive