Re: Kernel 4.17.4 lockup

From: H.J. Lu
Date: Thu Jul 12 2018 - 10:44:06 EST


On Wed, Jul 11, 2018 at 4:14 PM, Dave Hansen <dave.hansen@xxxxxxxxx> wrote:
> On 07/11/2018 04:07 PM, Andy Lutomirski wrote:
>> Could the cause be an overflow of the IRQ stack? I've been meaning
>> to put guard pages on all the special stacks for a while. Let me see
>> if I can do that in the next couple days.
>
> But what would that overflow into? Wouldn't it most likely be another
> interrupt stack since they're all allocated together?
>
> This looks more like thread stack corruption.

I tried netconsole and got this:

[29369.552998] ------------[ cut here ]------------
[29369.560996] kernel BUG at mm/page_alloc.c:2019!
[29369.568980] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[29369.576892] Modules linked in: netconsole xt_CHECKSUM
ipt_MASQUERADE nf_nat_masquerade_ipv4 tun nf_conntrack_netbios_ns
nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6
xt_conntrack devlink ip_set nfnetlink ebtable_nat ebtable_broute
bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6
nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack
libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter
ebtables ip6table_filter ip6_tables sunrpc vfat fat intel_powerclamp
coretemp kvm_intel kvm irqbypass intel_cstate intel_uncore
snd_hda_codec_realtek snd_hda_codec_generic iTCO_wdt
iTCO_vendor_support gpio_ich snd_hda_intel joydev snd_hda_codec
snd_hda_core snd_hwdep mxm_wmi snd_seq snd_seq_device
[29369.627745] snd_pcm pcspkr snd_timer snd i2c_i801 soundcore
lpc_ich i5500_temp i7core_edac shpchp wmi acpi_cpufreq ata_generic
pata_acpi radeon crc32c_intel i2c_algo_bit drm_kms_helper
firewire_ohci firewire_core ttm crc_itu_t drm e1000e pata_marvell
[29369.645472] CPU: 1 PID: 3896 Comm: expect Tainted: G I
4.17.5+ #7
[29369.654333] Hardware name: /DX58SO, BIOS
SOX5810J.86A.5600.2013.0729.2250 07/29/2013
[29369.663320] RIP: 0010:move_freepages_block+0x246/0x4b0
[29369.672238] RSP: 0018:ffff8800b61f7178 EFLAGS: 00010002
[29369.681064] RAX: ffff8801af3d7000 RBX: ffffea00033c8000 RCX: 0000000000000000
[29369.690011] RDX: dffffc0000000000 RSI: ffffea00033cc000 RDI: ffffffff831d8ec0
[29369.698992] RBP: ffff8801af3d7680 R08: ffff8800b61f73c8 R09: ffffed0035e7af78
[29369.708025] R10: ffffed0035e7af78 R11: ffff8801af3d7bc3 R12: ffff8800b61f7228

before the machine locked up.

--
H.J.