Re: [PATCH] mm/x86/pat: Only untrack the pfn range if unmap region

From: David Wang
Date: Sun Jul 14 2024 - 07:02:25 EST



At 2024-07-12 22:42:44, "Peter Xu" <peterx@xxxxxxxxxx> wrote:
>NOTE: I massaged the commit message comparing to the rfc post [1], the
>patch itself is untouched. Also removed rfc tag, and added more people
>into the loop. Please kindly help test this patch if you have a reproducer,
>as I can't reproduce it myself even with the syzbot reproducer on top of
>mm-unstable. Instead of further check on the reproducer, I decided to send
>this out first as we have a bunch of reproducers on the list now..
>---
> mm/memory.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
>diff --git a/mm/memory.c b/mm/memory.c
>index 4bcd79619574..f57cc304b318 100644
>--- a/mm/memory.c
>+++ b/mm/memory.c
>@@ -1827,9 +1827,6 @@ static void unmap_single_vma(struct mmu_gather *tlb,
> if (vma->vm_file)
> uprobe_munmap(vma, start, end);
>
>- if (unlikely(vma->vm_flags & VM_PFNMAP))
>- untrack_pfn(vma, 0, 0, mm_wr_locked);
>-
> if (start != end) {
> if (unlikely(is_vm_hugetlb_page(vma))) {
> /*
>@@ -1894,6 +1891,8 @@ void unmap_vmas(struct mmu_gather *tlb, struct ma_state *mas,
> unsigned long start = start_addr;
> unsigned long end = end_addr;
> hugetlb_zap_begin(vma, &start, &end);
>+ if (unlikely(vma->vm_flags & VM_PFNMAP))
>+ untrack_pfn(vma, 0, 0, mm_wr_locked);
> unmap_single_vma(tlb, vma, start, end, &details,
> mm_wr_locked);
> hugetlb_zap_end(vma, &details);
>--
>2.45.0

Hi,

Today I noticed kernel warnings with this patch applied. They showed up right after the system resumed from suspend:


[Sun Jul 14 16:51:38 2024] OOM killer enabled.
[Sun Jul 14 16:51:38 2024] Restarting tasks ... done.
[Sun Jul 14 16:51:38 2024] random: crng reseeded on system resumption
[Sun Jul 14 16:51:38 2024] PM: suspend exit
[Sun Jul 14 16:51:38 2024] ------------[ cut here ]------------
[Sun Jul 14 16:51:38 2024] WARNING: CPU: 1 PID: 2484 at arch/x86/mm/pat/memtype.c:1002 untrack_pfn+0x10c/0x120
[Sun Jul 14 16:51:38 2024] Modules linked in: snd_seq_dummy(E) snd_hrtimer(E) snd_seq(E) ctr(E) ccm(E) nf_conntrack_netlink(E) xfrm_user(E) xfrm_algo(E) xt_addrtype(E) br_netfilter(E) xt_CHECKSUM(E) xt_MASQUERADE(E) xt_conntrack(E) ipt_REJECT(E) nf_reject_ipv4(E) xt_tcpudp(E) nft_compat(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) nf_tables(E) nfnetlink(E) bridge(E) stp(E) llc(E) overlay(E) binfmt_misc(E) nls_ascii(E) nls_cp437(E) vfat(E) fat(E) amd_atl(E) intel_rapl_msr(E) intel_rapl_common(E) nvidia_drm(POE) nvidia_modeset(POE) edac_mce_amd(E) kvm_amd(E) snd_hda_codec_realtek(E) kvm(E) iwlmvm(E) snd_hda_codec_generic(E) crct10dif_pclmul(E) snd_hda_scodec_component(E) snd_hda_codec_hdmi(E) ghash_clmulni_intel(E) sha512_ssse3(E) mac80211(E) sha512_generic(E) snd_hda_intel(E) nvidia(POE) sha256_ssse3(E) snd_intel_dspcfg(E) ppdev(E) sha1_ssse3(E) libarc4(E) snd_hda_codec(E) snd_usb_audio(E) snd_usbmidi_lib(E) uvcvideo(E) snd_hda_core(E) iwlwifi(E) aesni_intel(E) snd_rawmidi(E) snd_pcsp(E)
[Sun Jul 14 16:51:38 2024]  snd_hwdep(E) snd_seq_device(E) crypto_simd(E) videobuf2_vmalloc(E) snd_pcm(E) cryptd(E) uvc(E) videobuf2_memops(E) videobuf2_v4l2(E) snd_timer(E) rapl(E) cfg80211(E) k10temp(E) wmi_bmof(E) sp5100_tco(E) acpi_cpufreq(E) ccp(E) snd(E) videodev(E) drm_kms_helper(E) videobuf2_common(E) rfkill(E) video(E) rng_core(E) mc(E) soundcore(E) joydev(E) parport_pc(E) parport(E) sg(E) evdev(E) msr(E) loop(E) fuse(E) drm(E) efi_pstore(E) dm_mod(E) configfs(E) ip_tables(E) x_tables(E) autofs4(E) ext4(E) crc16(E) mbcache(E) jbd2(E) btrfs(E) blake2b_generic(E) efivarfs(E) raid10(E) raid456(E) async_raid6_recov(E) async_memcpy(E) async_pq(E) async_xor(E) async_tx(E) raid1(E) raid0(E) md_mod(E) hid_generic(E) usbhid(E) hid(E) sd_mod(E) ahci(E) libahci(E) xhci_pci(E) nvme(E) libata(E) crc32_pclmul(E) nvme_core(E) xhci_hcd(E) t10_pi(E) crc32c_intel(E) i2c_piix4(E) r8169(E) crc64_rocksoft(E) realtek(E) scsi_mod(E) usbcore(E) scsi_common(E) usb_common(E) wmi(E) gpio_amdpt(E) gpio_generic(E) button(E)
[Sun Jul 14 16:51:38 2024] CPU: 1 PID: 2484 Comm: gnome-shell Tainted: P           OE      6.10.0-rc7-linan-1 #283
[Sun Jul 14 16:51:38 2024] Hardware name: Micro-Star International Co., Ltd. MS-7B89/B450M MORTAR MAX (MS-7B89), BIOS 2.80 06/10/2020
[Sun Jul 14 16:51:38 2024] RIP: 0010:untrack_pfn+0x10c/0x120
[Sun Jul 14 16:51:38 2024] Code: e2 01 74 22 8b 98 e0 00 00 00 3b 5d 2c 74 ac 48 8b 7d 30 e8 66 e1 bc 00 89 5d 2c 48 8b 7d 30 e8 0a 6c 09 00 eb 95 0f 0b eb da <0f> 0b eb 95 e8 db b6 bb 00 66 66 2e 0f 1f 84 00 00 00 00 00 90 90
[Sun Jul 14 16:51:38 2024] RSP: 0018:ffffae5b4ab1fbe8 EFLAGS: 00010202
[Sun Jul 14 16:51:38 2024] RAX: 0000000000000028 RBX: 0000000000000000 RCX: 0000000000000000
[Sun Jul 14 16:51:38 2024] RDX: 0000000000000001 RSI: 000fffffffe00000 RDI: ffff91d5be99ea80
[Sun Jul 14 16:51:38 2024] RBP: ffff91d5c44fbe70 R08: 00007f2e5ff32000 R09: 0000000000000001
[Sun Jul 14 16:51:38 2024] R10: ffff91d5b7ad6d1c R11: 00007f2e5ff35fff R12: 00007f2e5ff32000
[Sun Jul 14 16:51:38 2024] R13: 0000000000000000 R14: ffffae5b4ab1fde8 R15: ffff91d5c44fbe70
[Sun Jul 14 16:51:38 2024] FS:  00007f2e5ff59dc0(0000) GS:ffff91d84ec80000(0000) knlGS:0000000000000000
[Sun Jul 14 16:51:38 2024] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Sun Jul 14 16:51:38 2024] CR2: 00007fe71316b08c CR3: 000000018468e000 CR4: 0000000000350ef0
[Sun Jul 14 16:51:38 2024] Call Trace:
[Sun Jul 14 16:51:38 2024]  <TASK>
[Sun Jul 14 16:51:38 2024]  ? __warn+0x7c/0x120
[Sun Jul 14 16:51:38 2024]  ? untrack_pfn+0x10c/0x120
[Sun Jul 14 16:51:38 2024]  ? report_bug+0x18d/0x1c0
[Sun Jul 14 16:51:38 2024]  ? handle_bug+0x3c/0x80
[Sun Jul 14 16:51:38 2024]  ? exc_invalid_op+0x13/0x60
[Sun Jul 14 16:51:38 2024]  ? asm_exc_invalid_op+0x16/0x20
[Sun Jul 14 16:51:38 2024]  ? untrack_pfn+0x10c/0x120
[Sun Jul 14 16:51:38 2024]  ? untrack_pfn+0x53/0x120
[Sun Jul 14 16:51:38 2024]  unmap_vmas+0x115/0x1a0
[Sun Jul 14 16:51:38 2024]  unmap_region+0xd4/0x150
[Sun Jul 14 16:51:38 2024]  ? mas_nomem+0x14/0x80
[Sun Jul 14 16:51:38 2024]  ? srso_return_thunk+0x5/0x5f
[Sun Jul 14 16:51:38 2024]  ? mas_store_gfp+0x54/0x110
[Sun Jul 14 16:51:38 2024]  do_vmi_align_munmap+0x2d4/0x530
[Sun Jul 14 16:51:38 2024]  do_vmi_munmap+0xda/0x190
[Sun Jul 14 16:51:38 2024]  __vm_munmap+0xa0/0x160
[Sun Jul 14 16:51:38 2024]  __x64_sys_munmap+0x17/0x20
[Sun Jul 14 16:51:38 2024]  do_syscall_64+0x4b/0x110
[Sun Jul 14 16:51:38 2024]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[Sun Jul 14 16:51:38 2024] RIP: 0033:0x7f2e647208f7
[Sun Jul 14 16:51:38 2024] Code: 00 00 00 48 8b 15 09 05 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 b8 0b 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d d9 04 0d 00 f7 d8 64 89 01 48
[Sun Jul 14 16:51:38 2024] RSP: 002b:00007ffd289f0a48 EFLAGS: 00000246 ORIG_RAX: 000000000000000b
[Sun Jul 14 16:51:38 2024] RAX: ffffffffffffffda RBX: 00007f2e5ff31000 RCX: 00007f2e647208f7
[Sun Jul 14 16:51:38 2024] RDX: 0000000000000000 RSI: 0000000000001000 RDI: 00007f2e5ff31000
[Sun Jul 14 16:51:38 2024] RBP: 0000557d5a9330a0 R08: 00000000c1d00028 R09: 00000000beef0100
[Sun Jul 14 16:51:38 2024] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
[Sun Jul 14 16:51:38 2024] R13: 0000000000000001 R14: 0000000000000002 R15: 0000557d5a8408c0
[Sun Jul 14 16:51:38 2024]  </TASK>
[Sun Jul 14 16:51:38 2024] ---[ end trace 0000000000000000 ]---
[Sun Jul 14 16:51:39 2024] ------------[ cut here ]------------
[Sun Jul 14 16:51:39 2024] WARNING: CPU: 1 PID: 2272 at arch/x86/mm/pat/memtype.c:1002 track_pfn_copy+0x94/0xa0
[Sun Jul 14 16:51:39 2024] Modules linked in: snd_seq_dummy(E) snd_hrtimer(E) snd_seq(E) ctr(E) ccm(E) nf_conntrack_netlink(E) xfrm_user(E) xfrm_algo(E) xt_addrtype(E) br_netfilter(E) xt_CHECKSUM(E) xt_MASQUERADE(E) xt_conntrack(E) ipt_REJECT(E) nf_reject_ipv4(E) xt_tcpudp(E) nft_compat(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) nf_tables(E) nfnetlink(E) bridge(E) stp(E) llc(E) overlay(E) binfmt_misc(E) nls_ascii(E) nls_cp437(E) vfat(E) fat(E) amd_atl(E) intel_rapl_msr(E) intel_rapl_common(E) nvidia_drm(POE) nvidia_modeset(POE) edac_mce_amd(E) kvm_amd(E) snd_hda_codec_realtek(E) kvm(E) iwlmvm(E) snd_hda_codec_generic(E) crct10dif_pclmul(E) snd_hda_scodec_component(E) snd_hda_codec_hdmi(E) ghash_clmulni_intel(E) sha512_ssse3(E) mac80211(E) sha512_generic(E) snd_hda_intel(E) nvidia(POE) sha256_ssse3(E) snd_intel_dspcfg(E) ppdev(E) sha1_ssse3(E) libarc4(E) snd_hda_codec(E) snd_usb_audio(E) snd_usbmidi_lib(E) uvcvideo(E) snd_hda_core(E) iwlwifi(E) aesni_intel(E) snd_rawmidi(E) snd_pcsp(E)
[Sun Jul 14 16:51:39 2024]  snd_hwdep(E) snd_seq_device(E) crypto_simd(E) videobuf2_vmalloc(E) snd_pcm(E) cryptd(E) uvc(E) videobuf2_memops(E) videobuf2_v4l2(E) snd_timer(E) rapl(E) cfg80211(E) k10temp(E) wmi_bmof(E) sp5100_tco(E) acpi_cpufreq(E) ccp(E) snd(E) videodev(E) drm_kms_helper(E) videobuf2_common(E) rfkill(E) video(E) rng_core(E) mc(E) soundcore(E) joydev(E) parport_pc(E) parport(E) sg(E) evdev(E) msr(E) loop(E) fuse(E) drm(E) efi_pstore(E) dm_mod(E) configfs(E) ip_tables(E) x_tables(E) autofs4(E) ext4(E) crc16(E) mbcache(E) jbd2(E) btrfs(E) blake2b_generic(E) efivarfs(E) raid10(E) raid456(E) async_raid6_recov(E) async_memcpy(E) async_pq(E) async_xor(E) async_tx(E) raid1(E) raid0(E) md_mod(E) hid_generic(E) usbhid(E) hid(E) sd_mod(E) ahci(E) libahci(E) xhci_pci(E) nvme(E) libata(E) crc32_pclmul(E) nvme_core(E) xhci_hcd(E) t10_pi(E) crc32c_intel(E) i2c_piix4(E) r8169(E) crc64_rocksoft(E) realtek(E) scsi_mod(E) usbcore(E) scsi_common(E) usb_common(E) wmi(E) gpio_amdpt(E) gpio_generic(E) button(E)
[Sun Jul 14 16:51:39 2024] CPU: 1 PID: 2272 Comm: Xorg Tainted: P        W  OE      6.10.0-rc7-linan-1 #283
[Sun Jul 14 16:51:39 2024] Hardware name: Micro-Star International Co., Ltd. MS-7B89/B450M MORTAR MAX (MS-7B89), BIOS 2.80 06/10/2020
[Sun Jul 14 16:51:39 2024] RIP: 0010:track_pfn_copy+0x94/0xa0
[Sun Jul 14 16:51:39 2024] Code: ff ff ff eb b4 48 89 ee 48 8b 44 24 10 48 8b 3c 24 b9 01 00 00 00 4c 29 e6 48 8d 54 24 08 48 89 44 24 08 e8 fe fc ff ff eb 8f <0f> 0b eb d0 e8 73 b9 bb 00 0f 1f 00 90 90 90 90 90 90 90 90 90 90
[Sun Jul 14 16:51:39 2024] RSP: 0018:ffffae5b4a04fb68 EFLAGS: 00010202
[Sun Jul 14 16:51:39 2024] RAX: 0000000000000028 RBX: ffff91d546ae1d10 RCX: 0000000000000000
[Sun Jul 14 16:51:39 2024] RDX: 0000000000000001 RSI: 000fffffffe00000 RDI: ffff91d5b969c700
[Sun Jul 14 16:51:39 2024] RBP: 00007fe71316e000 R08: ffff91d639b0b9a0 R09: 00007fe71316e000
[Sun Jul 14 16:51:39 2024] R10: 00007fe71316dfff R11: 00007fe71316efff R12: 00007fe71316d000
[Sun Jul 14 16:51:39 2024] R13: ffff91d543702f40 R14: ffff91d639b0b9a0 R15: 00007fe71316e000
[Sun Jul 14 16:51:39 2024] FS:  00007fe7124f8ac0(0000) GS:ffff91d84ec80000(0000) knlGS:0000000000000000
[Sun Jul 14 16:51:39 2024] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Sun Jul 14 16:51:39 2024] CR2: 000055d6ab0453c0 CR3: 0000000179626000 CR4: 0000000000350ef0
[Sun Jul 14 16:51:39 2024] Call Trace:
[Sun Jul 14 16:51:39 2024]  <TASK>
[Sun Jul 14 16:51:39 2024]  ? __warn+0x7c/0x120
[Sun Jul 14 16:51:39 2024]  ? track_pfn_copy+0x94/0xa0
[Sun Jul 14 16:51:39 2024]  ? report_bug+0x18d/0x1c0
[Sun Jul 14 16:51:39 2024]  ? handle_bug+0x3c/0x80
[Sun Jul 14 16:51:39 2024]  ? exc_invalid_op+0x13/0x60
[Sun Jul 14 16:51:39 2024]  ? asm_exc_invalid_op+0x16/0x20
[Sun Jul 14 16:51:39 2024]  ? track_pfn_copy+0x94/0xa0
[Sun Jul 14 16:51:39 2024]  ? track_pfn_copy+0x57/0xa0
[Sun Jul 14 16:51:39 2024]  ? percpu_counter_add_batch+0x2e/0xa0
[Sun Jul 14 16:51:39 2024]  copy_page_range+0x156e/0x1630
[Sun Jul 14 16:51:39 2024]  ? srso_return_thunk+0x5/0x5f
[Sun Jul 14 16:51:39 2024]  ? mod_objcg_state+0xc9/0x2d0
[Sun Jul 14 16:51:39 2024]  ? obj_cgroup_charge+0x13f/0x1c0
[Sun Jul 14 16:51:39 2024]  ? __memcg_slab_post_alloc_hook+0x201/0x380
[Sun Jul 14 16:51:39 2024]  ? srso_return_thunk+0x5/0x5f
[Sun Jul 14 16:51:39 2024]  copy_process+0x1500/0x26e0
[Sun Jul 14 16:51:39 2024]  kernel_clone+0x97/0x3a0
[Sun Jul 14 16:51:39 2024]  ? srso_return_thunk+0x5/0x5f
[Sun Jul 14 16:51:39 2024]  ? preempt_count_add+0x69/0xa0
[Sun Jul 14 16:51:39 2024]  __do_sys_clone+0x66/0x90
[Sun Jul 14 16:51:39 2024]  do_syscall_64+0x4b/0x110
[Sun Jul 14 16:51:39 2024]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[Sun Jul 14 16:51:39 2024] RIP: 0033:0x7fe712b3a293
[Sun Jul 14 16:51:39 2024] Code: 00 00 00 00 00 66 90 64 48 8b 04 25 10 00 00 00 45 31 c0 31 d2 31 f6 bf 11 00 20 01 4c 8d 90 d0 02 00 00 b8 38 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 35 89 c2 85 c0 75 2c 64 48 8b 04 25 10 00 00
[Sun Jul 14 16:51:39 2024] RSP: 002b:00007ffff7e77a98 EFLAGS: 00000246 ORIG_RAX: 0000000000000038
[Sun Jul 14 16:51:39 2024] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fe712b3a293
[Sun Jul 14 16:51:39 2024] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
[Sun Jul 14 16:51:39 2024] RBP: 0000000000000001 R08: 0000000000000000 R09: 000055dacf937b10
[Sun Jul 14 16:51:39 2024] R10: 00007fe7124f8d90 R11: 0000000000000246 R12: 0000000000000000
[Sun Jul 14 16:51:39 2024] R13: 00007ffff7e77bb0 R14: 00007ffff7e77aa0 R15: 000055daa65272a0
[Sun Jul 14 16:51:39 2024]  </TASK>
[Sun Jul 14 16:51:39 2024] ---[ end trace 0000000000000000 ]---
[Sun Jul 14 16:51:39 2024] rfkill: input handler disabled
[Sun Jul 14 16:51:39 2024] Generic FE-GE Realtek PHY r8169-0-2200:00: attached PHY driver (mii_bus:phy_addr=r8169-0-2200:00, irq=MAC)






(I do not yet know a reliable procedure to reproduce it.)



FYI

David