possible stack corruption in icmp_send (__stack_chk_fail)
From: Jason A. Donenfeld
Date: Wed Feb 17 2021 - 13:14:46 EST
Hi Netdev & Willem,
I've received a report of stack corruption -- via the stack protector
check -- in icmp_send. I was sent a vmcore, and was able to extract
the oops from it. However, I've been unable to reproduce the bug, and
I don't see where it would be in the code. That might point to a more
sinister problem, or I'm simply not seeing it. Apparently the reporter
reproduces it every 40 or so minutes, and has seen it happen since at
least ~5.10.
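
For context on the check itself: with CONFIG_STACKPROTECTOR the
compiler places a canary between a function's stack buffers and its
control data and verifies it in the function epilogue, calling
__stack_chk_fail on a mismatch. Here's a tiny userspace demo of the
same mechanism -- nothing icmp-specific, just to show the class of bug
that trips it:

/* Userspace demo only; build with: gcc -fstack-protector-strong demo.c
 * Writing past `buf` clobbers the canary the compiler placed above it,
 * so the epilogue of overflow() calls __stack_chk_fail() -- the same
 * symbol as in the oops below, where the kernel's handler panics.
 */
#include <string.h>

static void overflow(const char *src, size_t len)
{
	char buf[16];           /* canary sits just beyond this */
	memcpy(buf, src, len);  /* len > 16 smashes the canary */
}

int main(void)
{
	char big[64];
	memset(big, 'A', sizeof(big));
	overflow(big, sizeof(big)); /* aborts: stack smashing detected */
	return 0;
}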
Willem - I'm emailing you because it seems like you were making a lot
of changes to the icmp code around then, and perhaps you have an
intuition. For example, some of the error handling code takes a
pointer to a stack buffer (_objh and such), and maybe that's
problematic? I'm not quite sure.
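
To make concrete the kind of pattern I mean, here's the shape of it,
heavily condensed and paraphrased from net/ipv4/icmp.c as of ~5.11
(most of the function is elided; go by the actual tree, not this):

/* Condensed/paraphrased, not verbatim. */
struct ip_options_data {
	struct ip_options_rcu	opt;
	char			data[40];	/* fixed-size option storage */
};

struct icmp_bxm {
	/* ... */
	struct ip_options_data	replyopts;
};

void __icmp_send(struct sk_buff *skb_in, int type, int code, __be32 info,
		 const struct ip_options *opt)
{
	struct icmp_bxm icmp_param;	/* large on-stack object */
	/* ... */
	/* A pointer into the on-stack option buffer escapes into a
	 * helper, which writes into it based on lengths it reads
	 * elsewhere: */
	if (__ip_options_echo(net, &icmp_param.replyopts.opt.opt,
			      skb_in, opt))
		goto out_unlock;
	/* ... */
}
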
The vmcore, along with the various kernel binaries I hunted down, is
here:
https://data.zx2c4.com/icmp_send-crash-e03b4a42-706a-43bf-bc40-1f15966b3216.tar.xz
The extracted dmesg follows below, in case you or anyone has a
pointer. I've been staring at this for a while and don't see it.
Jason
Kernel panic - not syncing: stack-protector: Kernel stack is corrupted
in: __icmp_send+0x5bd/0x5c0
CPU: 0 PID: 959 Comm: kworker/0:2 Kdump: loaded Not tainted
5.11.0-051100-lowlatency #202102142330
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.13.0-48-gd9c812dda519-prebuilt.qemu.org 04/01/2014
Workqueue: wg-crypt-wg0 wg_packet_decrypt_worker [wireguard]
Call Trace:
<IRQ>
show_stack+0x52/0x58
dump_stack+0x70/0x8b
panic+0x108/0x2ea
? ip_push_pending_frames+0x42/0x90
? __icmp_send+0x5bd/0x5c0
__stack_chk_fail+0x14/0x20
__icmp_send+0x5bd/0x5c0
icmp_ndo_send+0x148/0x160
wg_xmit+0x359/0x450 [wireguard]
? harmonize_features+0x19/0x80
xmit_one.constprop.0+0x9f/0x190
dev_hard_start_xmit+0x43/0x90
sch_direct_xmit+0x11d/0x340
__qdisc_run+0x66/0xc0
__dev_xmit_skb+0xd5/0x340
__dev_queue_xmit+0x32b/0x4d0
? nf_conntrack_double_lock.constprop.0+0x97/0x140 [nf_conntrack]
dev_queue_xmit+0x10/0x20
neigh_connected_output+0xcb/0xf0
ip_finish_output2+0x17f/0x470
__ip_finish_output+0x9b/0x140
? ipv4_confirm+0x4a/0x80 [nf_conntrack]
ip_finish_output+0x2d/0xb0
ip_output+0x78/0x110
? __ip_finish_output+0x140/0x140
ip_forward_finish+0x58/0x90
ip_forward+0x40a/0x4d0
? ip4_key_hashfn+0xb0/0xb0
ip_sublist_rcv_finish+0x3d/0x50
ip_list_rcv_finish.constprop.0+0x163/0x190
ip_sublist_rcv+0x37/0xb0
? ip_rcv_finish_core.constprop.0+0x310/0x310
ip_list_rcv+0xf5/0x120
__netif_receive_skb_list_core+0x228/0x250
__netif_receive_skb_list+0x102/0x170
? dev_gro_receive+0x1b5/0x370
netif_receive_skb_list_internal+0xca/0x190
napi_complete_done+0x7a/0x1a0
wg_packet_rx_poll+0x384/0x400 [wireguard]
napi_poll+0x92/0x200
net_rx_action+0xb8/0x1c0
__do_softirq+0xce/0x2b3
asm_call_irq_on_stack+0x12/0x20
</IRQ>
do_softirq_own_stack+0x3d/0x50
do_softirq+0x66/0x80
__local_bh_enable_ip+0x62/0x70
_raw_spin_unlock_bh+0x1e/0x20
wg_packet_decrypt_worker+0xf6/0x190 [wireguard]
process_one_work+0x217/0x3e0
worker_thread+0x4d/0x350
? rescuer_thread+0x390/0x390
kthread+0x145/0x170
? __kthread_bind_mask+0x70/0x70
ret_from_fork+0x22/0x30
Kernel Offset: 0x2000000 from 0xffffffff81000000 (relocation range:
0xffffffff80000000-0xffffffffbfffffff)