Re: Machine check recovery broken in v6.9-rc1

From: Tony Luck
Date: Fri Apr 05 2024 - 19:59:28 EST


On Fri, Apr 05, 2024 at 03:05:17PM +0000, Luck, Tony wrote:
> >> kernel BUG at include/linux/swapops.h:119
> >>
> >> 117 static inline unsigned long swp_offset_pfn(swp_entry_t entry)
> >> 118 {
> >> 119 VM_BUG_ON(!is_pfn_swap_entry(entry));
> >> 120 return swp_offset(entry) & SWP_PFN_MASK;
> >> 121 }
> >>
> >
> > Do you have the full backtrace?
>
> It's very garbled because multiple CPUs seem to be logging to the console together.
>
> I've also got some more details.
>
> 1) Problems with recovery go back further than v6.8. Still looking to pin down where it started.
> v6.0 seems good (can run hundreds of injections + consumptions).
> 2) It's connected to a BIOS setting that enables APEI logging. If I disable that, current
> kernels don't have problems. v6.0 runs OK with this setting on or off.
>
> I'm bisecting to find where the APEI issue begins.

So a problem (not necessarily the current upstream problem given the
amount of additional changes that happened) began way back in this
commit from v6.1:

0d206b5d2e0d ("mm/swap: add swp_offset_pfn() to fetch PFN from swap entry")

This commit gives the "kernel BUG at include/linux/swapops.h:88!" while
its predecessor runs without errors).

The test case simply injects an uncorrectable memory error to DRAM, then
reads that location to consume the poison and generate a machine check.

memory_failure() must have been called multiple times (see the "already hardware
poisoned" message). Twice is pretty common. Intel eystems can report an
error twice. Once by the memory controller when it sees the ECC bits
don't add up. Second time by the CPU core when trying to consume the
poisoned data. But I can't explain four calls.

I unloaded the acpi_extlog and i10nm_edac modules before running the
test to get a less jumbled console log.


[ 159.345790] EINJ: Error INJection is initialized.
[ 183.645231] Disabling lock debugging due to kernel taint
[ 183.645243] mce: [Hardware Error]: Machine check events logged
[ 183.645940] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 0
[ 183.645945] {1}[Hardware Error]: event severity: recoverable
[ 183.645947] {1}[Hardware Error]: Error 0, type: recoverable
[ 183.645949] {1}[Hardware Error]: fru_text: Card02, ChnA, DIMM0
[ 183.645951] {1}[Hardware Error]: section_type: memory error
[ 183.645953] {1}[Hardware Error]: error_status: Storage error in DRAM memory (0x0000000000000400)
[ 183.645955] {1}[Hardware Error]: physical_address: 0x00000008a398a400
[ 183.645958] {1}[Hardware Error]: node:1 card:0 module:0 rank:0 bank:2 device:0 row:1136 column:584
[ 183.645959] {1}[Hardware Error]: error_type: 14, scrub uncorrected error
[ 183.645961] {1}[Hardware Error]: DIMM location: NODE 4 CPU1_DIMM_A1
[ 183.645962] {1}[Hardware Error]: Error 1, type: recoverable
[ 183.645963] {1}[Hardware Error]: section_type: memory error
[ 183.645964] {1}[Hardware Error]: error_status: Storage error in DRAM memory (0x0000000000000400)
[ 183.645965] {1}[Hardware Error]: physical_address: 0x00000008a398a400
[ 183.645965] {1}[Hardware Error]: node:1
[ 183.645966] {1}[Hardware Error]: error_type: 3, multi-bit ECC
[ 183.650625] mce: Uncorrected hardware memory error in user-access at 8a398a400
[ 183.763844] Memory failure: 0x8a398a: recovery action for dirty LRU page: Recovered
[ 183.778756] Memory failure: 0x8a398a: already hardware poisoned
[ 183.784695] Memory failure: 0x8a398a: already hardware poisoned
[ 183.790657] mce: [Hardware Error]: Machine check events logged
[ 183.790746] Memory failure: 0x8a398a: already hardware poisoned
[ 183.802491] ------------[ cut here ]------------
[ 183.807126] kernel BUG at include/linux/swapops.h:88!
[ 183.812199] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[ 183.817436] CPU: 139 PID: 4501 Comm: einj_mem_uc Tainted: G M 6.0.0-rc3+ #443
[ 183.825961] Hardware name: Intel Corporation WilsonCity/WilsonCity, BIOS WLYDCRB1.SYS.0021.P06.2104260458 04/26/2021
[ 183.836476] RIP: 0010:check_hwpoisoned_entry+0xba/0xc0
[ 183.841632] Code: 31 c0 c3 cc cc cc cc 48 83 e7 fd 41 89 c3 41 21 fb eb ce 48 83 e7 f7 41 89 c3 41 21 fb 41 f6 c3 02 74 be eb e2 0f 1f 44 00 00 <0f> 0b 0f 1f 40 00 0f 1f 44 00 00 4c 89 c0 48 89 d6 4d 8b 40 28 48
[ 183.860382] RSP: 0000:ff500c69ca963c90 EFLAGS: 00010246
[ 183.865608] RAX: 0000000000000016 RBX: 00007f8fb2a69000 RCX: 00000000008a398a
[ 183.872741] RDX: 0000000000000000 RSI: 00007f8fb2a69000 RDI: b7fffffeeb8cea00
[ 183.879874] RBP: ff486f47df6f9ca8 R08: ff500c69ca963e10 R09: 0000000000000000
[ 183.887004] R10: 000000000000000c R11: 0000000000000000 R12: ff486f47d17ca348
[ 183.894139] R13: 00007f8fb2a6a000 R14: ff486f48010c8b00 R15: ff500c69ca963e10
[ 183.901278] FS: 00007f8fb2874740(0000) GS:ff486f4f9fac0000(0000) knlGS:0000000000000000
[ 183.909362] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 183.915110] CR2: 00000000016b02f8 CR3: 00000008b81ce003 CR4: 0000000000771ee0
[ 183.922240] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 183.929373] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 183.936505] PKRU: 55555554
[ 183.939218] Call Trace:
[ 183.941670] <TASK>
[ 183.943776] hwpoison_pte_range+0x2ad/0x300
[ 183.947963] __walk_page_range+0x5f7/0xc80
[ 183.952072] ? tick_nohz_tick_stopped+0x12/0x30
[ 183.956609] ? rcu_read_lock_sched_held+0x10/0x80
[ 183.961327] walk_page_range+0xc6/0x180
[ 183.965173] kill_accessing_process+0xca/0x13d
[ 183.969628] memory_failure.cold+0x9a/0xbc
[ 183.973737] kill_me_maybe+0x3b/0x9a
[ 183.977324] task_work_run+0x59/0x90
[ 183.980910] exit_to_user_mode_prepare+0x22c/0x250
[ 183.985710] irqentry_exit_to_user_mode+0x5/0x30
[ 183.990340] noist_exc_machine_check+0x55/0xa0
[ 183.994792] asm_exc_machine_check+0x2f/0x40
[ 183.999073] RIP: 0033:0x405bff
[ 184.002140] Code: 00 e8 a5 c5 ff ff 55 53 48 83 ec 08 48 63 05 40 5d 00 00 48 8d 2c 07 bf 10 80 40 00 e8 ba c4 ff ff b8 00 00 00 00 89 c1 0f a2 <0f> be 45 00 48 83 c4 08 5b 5d c3 53 48 83 ec 20 48 c7 44 24 18 00
[ 184.020886] RSP: 002b:00007ffcfa807e80 EFLAGS: 00010206
[ 184.026111] RAX: 000000000000001b RBX: 00000000756e6547 RCX: 000000006c65746e
[ 184.033241] RDX: 0000000049656e69 RSI: 00000000016ae4a0 RDI: 00007f8fb2a4ea30
[ 184.040376] RBP: 00007f8fb2a69400 R08: 00000000016ae2a0 R09: 0000000000000001
[ 184.047508] R10: 00007ffcfa807e60 R11: 0000000000000202 R12: 0000000000000003
[ 184.054639] R13: 0000000000000000 R14: 00007f8fb2a9d000 R15: 000000000040ae00
[ 184.061774] </TASK>
[ 184.063973] Modules linked in: einj snd_seq_dummy snd_hrtimer snd_seq snd_seq_device snd_timer snd soundcore nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill ip_set nf_tables nfnetlink qrtr intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp ipmi_ssif sunrpc kvm_intel binfmt_misc kvm irqbypass vfat rapl fat intel_cstate intel_uncore pcspkr isst_if_mbox_pci isst_if_mmio isst_if_common i2c_i801 i2c_smbus mei_me acpi_ipmi mei intel_pch_thermal intel_vsec ioatdma ipmi_si ipmi_devintf joydev ipmi_msghandler loop zram crct10dif_pclmul ast crc32_pclmul crc32c_intel polyval_clmulni drm_vram_helper polyval_generic ixgbe drm_ttm_helper igb mdio ttm ghash_clmulni_intel dca wmi ip6_tables ip_tables fuse
[ 184.064048] Unloaded tainted modules: i10nm_edac():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1
[ 184.149160] acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1
[ 184.236447] acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1
[ 184.323640] acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1
[ 184.410309] acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1
[ 184.497416] acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1
[ 184.584254] pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 pcc_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1
[ 184.671099] pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 fjes():1 acpi_cpufreq():1 pcc_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 fjes():1 acpi_cpufreq():1 pcc_cpufreq():1 fjes():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 pcc_cpufreq():1 acpi_cpufreq():1 acpi_cpufreq():1 fjes():1 fjes():1 fjes():1 [last unloaded: i10nm_edac]
[ 184.796172] ---[ end trace 0000000000000000 ]---
[ 184.806944] RIP: 0010:check_hwpoisoned_entry+0xba/0xc0
[ 184.812107] Code: 31 c0 c3 cc cc cc cc 48 83 e7 fd 41 89 c3 41 21 fb eb ce 48 83 e7 f7 41 89 c3 41 21 fb 41 f6 c3 02 74 be eb e2 0f 1f 44 00 00 <0f> 0b 0f 1f 40 00 0f 1f 44 00 00 4c 89 c0 48 89 d6 4d 8b 40 28 48
[ 184.830868] RSP: 0000:ff500c69ca963c90 EFLAGS: 00010246
[ 184.836111] RAX: 0000000000000016 RBX: 00007f8fb2a69000 RCX: 00000000008a398a
[ 184.843261] RDX: 0000000000000000 RSI: 00007f8fb2a69000 RDI: b7fffffeeb8cea00
[ 184.850410] RBP: ff486f47df6f9ca8 R08: ff500c69ca963e10 R09: 0000000000000000
[ 184.857560] R10: 000000000000000c R11: 0000000000000000 R12: ff486f47d17ca348
[ 184.864709] R13: 00007f8fb2a6a000 R14: ff486f48010c8b00 R15: ff500c69ca963e10
[ 184.871855] FS: 00007f8fb2874740(0000) GS:ff486f4f9fac0000(0000) knlGS:0000000000000000
[ 184.879950] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 184.885704] CR2: 00000000016b02f8 CR3: 00000008b81ce003 CR4: 0000000000771ee0
[ 184.892847] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 184.899995] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 184.907135] PKRU: 55555554
[ 184.909857] note: einj_mem_uc[4501] exited with preempt_count 1