[PATCH ipsec] xfrm: avoid using skb->mac_len to decide if mac header is shown

From: En-Wei Wu
Date: Wed Sep 11 2024 - 23:38:30 EST


When we use Intel WWAN with xfrm, our system always hangs after
browsing websites for a few seconds. The error message shows that
it is a slab-out-of-bounds error:

[ 67.162014] BUG: KASAN: slab-out-of-bounds in xfrm_input+0x426e/0x6740
[ 67.162030] Write of size 2 at addr ffff888156cb814b by task ksoftirqd/2/26

[ 67.162043] CPU: 2 UID: 0 PID: 26 Comm: ksoftirqd/2 Not tainted 6.11.0-rc6-c763c4339688+ #2
[ 67.162053] Hardware name: Dell Inc. Latitude 5340/0SG010, BIOS 1.15.0 07/15/2024
[ 67.162058] Call Trace:
[ 67.162062] <TASK>
[ 67.162068] dump_stack_lvl+0x76/0xa0
[ 67.162079] print_report+0xce/0x5f0
[ 67.162088] ? xfrm_input+0x426e/0x6740
[ 67.162096] ? kasan_complete_mode_report_info+0x26/0x200
[ 67.162105] ? xfrm_input+0x426e/0x6740
[ 67.162112] kasan_report+0xbe/0x110
[ 67.162119] ? xfrm_input+0x426e/0x6740
[ 67.162129] __asan_report_store_n_noabort+0x12/0x30
[ 67.162138] xfrm_input+0x426e/0x6740
[ 67.162149] ? __pfx_xfrm_input+0x10/0x10
[ 67.162160] ? __kasan_check_read+0x11/0x20
[ 67.162168] ? __call_rcu_common+0x3e7/0x15b0
[ 67.162178] xfrm4_rcv_encap+0x214/0x470
[ 67.162186] ? __xfrm4_udp_encap_rcv.part.0+0x3cd/0x560
[ 67.162195] xfrm4_udp_encap_rcv+0xdd/0xf0
[ 67.162203] udp_queue_rcv_one_skb+0x880/0x12f0
[ 67.162212] udp_queue_rcv_skb+0x139/0xa90
[ 67.162221] udp_unicast_rcv_skb+0x116/0x350
[ 67.162229] __udp4_lib_rcv+0x213b/0x3410
[ 67.162237] ? ldsem_down_write+0x211/0x4ed
[ 67.162246] ? __pfx___udp4_lib_rcv+0x10/0x10
[ 67.162254] ? __pfx_raw_local_deliver+0x10/0x10
[ 67.162262] ? __pfx_cache_tag_flush_range_np+0x10/0x10
[ 67.162273] udp_rcv+0x86/0xb0
[ 67.162280] ip_protocol_deliver_rcu+0x152/0x380
[ 67.162289] ip_local_deliver_finish+0x282/0x370
[ 67.162296] ip_local_deliver+0x1a8/0x380
[ 67.162303] ? __pfx_ip_local_deliver+0x10/0x10
[ 67.162310] ? ip_rcv_finish_core.constprop.0+0x481/0x1ce0
[ 67.162317] ? ip_rcv_core+0x5df/0xd60
[ 67.162325] ip_rcv+0x2fc/0x380
[ 67.162332] ? __pfx_ip_rcv+0x10/0x10
[ 67.162338] ? __pfx_dma_map_page_attrs+0x10/0x10
[ 67.162346] ? __kasan_check_write+0x14/0x30
[ 67.162354] ? __build_skb_around+0x23a/0x350
[ 67.162363] ? __pfx_ip_rcv+0x10/0x10
[ 67.162369] __netif_receive_skb_one_core+0x173/0x1d0
[ 67.162377] ? __pfx___netif_receive_skb_one_core+0x10/0x10
[ 67.162386] ? __kasan_check_write+0x14/0x30
[ 67.162394] ? _raw_spin_lock_irq+0x8b/0x100
[ 67.162402] __netif_receive_skb+0x21/0x160
[ 67.162409] process_backlog+0x1c0/0x590
[ 67.162417] __napi_poll+0xab/0x550
[ 67.162425] net_rx_action+0x53e/0xd10
[ 67.162434] ? __pfx_net_rx_action+0x10/0x10
[ 67.162443] ? __pfx_wake_up_var+0x10/0x10
[ 67.162453] ? tasklet_action_common.constprop.0+0x22c/0x670
[ 67.162463] handle_softirqs+0x18f/0x5d0
[ 67.162472] ? __pfx_run_ksoftirqd+0x10/0x10
[ 67.162480] run_ksoftirqd+0x3c/0x60
[ 67.162487] smpboot_thread_fn+0x2f3/0x700
[ 67.162497] kthread+0x2b5/0x390
[ 67.162505] ? __pfx_smpboot_thread_fn+0x10/0x10
[ 67.162512] ? __pfx_kthread+0x10/0x10
[ 67.162519] ret_from_fork+0x43/0x90
[ 67.162527] ? __pfx_kthread+0x10/0x10
[ 67.162534] ret_from_fork_asm+0x1a/0x30
[ 67.162544] </TASK>

[ 67.162551] The buggy address belongs to the object at ffff888156cb8000
which belongs to the cache kmalloc-rnd-09-8k of size 8192
[ 67.162557] The buggy address is located 331 bytes inside of
allocated 8192-byte region [ffff888156cb8000, ffff888156cba000)

[ 67.162566] The buggy address belongs to the physical page:
[ 67.162570] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x156cb8
[ 67.162578] head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
[ 67.162583] flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff)
[ 67.162591] page_type: 0xfdffffff(slab)
[ 67.162599] raw: 0017ffffc0000040 ffff888100056780 dead000000000122 0000000000000000
[ 67.162605] raw: 0000000000000000 0000000080020002 00000001fdffffff 0000000000000000
[ 67.162611] head: 0017ffffc0000040 ffff888100056780 dead000000000122 0000000000000000
[ 67.162616] head: 0000000000000000 0000000080020002 00000001fdffffff 0000000000000000
[ 67.162621] head: 0017ffffc0000003 ffffea00055b2e01 ffffffffffffffff 0000000000000000
[ 67.162626] head: 0000000000000008 0000000000000000 00000000ffffffff 0000000000000000
[ 67.162630] page dumped because: kasan: bad access detected

[ 67.162636] Memory state around the buggy address:
[ 67.162640] ffff888156cb8000: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 67.162645] ffff888156cb8080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 67.162650] >ffff888156cb8100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 67.162653] ^
[ 67.162658] ffff888156cb8180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 67.162663] ffff888156cb8200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc

The reason is that the eth_hdr(skb) inside the if statement evaluated
to an unexpected address with skb->mac_header = ~0U (indicating there
is no MAC header). The unreliability of skb->mac_len causes the if
statement to become true even if there is no MAC header inside the
skb data buffer.

Replace the skb->mac_len in the if statement with the more reliable macro
skb_mac_header_was_set(skb) fixes this issue.

Fixes: b3284df1c86f ("xfrm: remove input2 indirection from xfrm_mode")
Signed-off-by: En-Wei Wu <en-wei.wu@xxxxxxxxxxxxx>
---
net/xfrm/xfrm_input.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
index 749e7eea99e4..93b261340105 100644
--- a/net/xfrm/xfrm_input.c
+++ b/net/xfrm/xfrm_input.c
@@ -251,7 +251,7 @@ static int xfrm4_remove_tunnel_encap(struct xfrm_state *x, struct sk_buff *skb)

skb_reset_network_header(skb);
skb_mac_header_rebuild(skb);
- if (skb->mac_len)
+ if (skb_mac_header_was_set(skb))
eth_hdr(skb)->h_proto = skb->protocol;

err = 0;
--
2.43.0