Re: [PATCH v2] bpf, sockmap: keep sk_msg copy state in sync
From: Jiayuan Chen
Date: Tue May 19 2026 - 23:20:08 EST
On 5/17/26 8:16 PM, Zhang Cen wrote:
SK_MSG uses msg->sg.copy as per-scatterlist-entry provenance. Entries
with this bit set are copied before data/data_end are exposed to SK_MSG
BPF programs for direct packet access.
bpf_msg_pull_data(), bpf_msg_push_data() and bpf_msg_pop_data() rewrite
the sk_msg scatterlist ring by collapsing, splitting and shifting
entries. These operations move msg->sg.data[] entries, but the parallel
copy bitmap can be left behind or stale in slots that no longer contain
the original entry. A copied entry can therefore later occupy a slot whose
copy bit is clear and be exposed as directly writable packet data.
Keep msg->sg.copy synchronized with scatterlist entry moves, preserve the
copy bit when an entry is split, clear it when a helper replaces an entry
with a private page, and clear every slot vacated by pull-data
compaction.
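
The invariant described above can be illustrated with a small user-space model. This is only a sketch, not the kernel code or the patch itself: `toy_ring`, `toy_entry`, `TOY_RING_SLOTS` and every `toy_*` helper are hypothetical names invented for illustration, and the ring is far simpler than the real `struct sk_msg_sg`. It shows the three rules the description states: an entry's copy bit travels with it when it moves, the vacated slot is cleared, and replacing an entry with a private page clears the bit.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_RING_SLOTS 8 /* hypothetical size, unrelated to the kernel's limit */

/* Hypothetical stand-in for one scatterlist entry. */
struct toy_entry {
	const void *page;
	unsigned int len;
};

/* Hypothetical stand-in for the entry array plus its parallel copy bitmap. */
struct toy_ring {
	struct toy_entry data[TOY_RING_SLOTS];
	unsigned long copy; /* bit i set: data[i] must be copied before exposure */
};

static bool toy_test_copy(const struct toy_ring *r, unsigned int i)
{
	return (r->copy >> i) & 1UL;
}

static void toy_assign_copy(struct toy_ring *r, unsigned int i, bool set)
{
	if (set)
		r->copy |= 1UL << i;
	else
		r->copy &= ~(1UL << i);
}

/* Move an entry and carry its copy bit along; the vacated slot is cleared. */
static void toy_move_entry(struct toy_ring *r, unsigned int dst, unsigned int src)
{
	if (dst == src)
		return;
	r->data[dst] = r->data[src];
	toy_assign_copy(r, dst, toy_test_copy(r, src));
	r->data[src] = (struct toy_entry){ 0 };
	toy_assign_copy(r, src, false);
}

/* Split entry i at offset off; the tail lands in slot j and inherits the bit. */
static void toy_split_entry(struct toy_ring *r, unsigned int i, unsigned int j,
			    unsigned int off)
{
	r->data[j].page = (const char *)r->data[i].page + off;
	r->data[j].len = r->data[i].len - off;
	r->data[i].len = off;
	toy_assign_copy(r, j, toy_test_copy(r, i));
}

/* Replace an entry with a privately owned page: the slot is no longer "copy". */
static void toy_replace_private(struct toy_ring *r, unsigned int i,
				const void *page, unsigned int len)
{
	r->data[i] = (struct toy_entry){ .page = page, .len = len };
	toy_assign_copy(r, i, false);
}
```

In this model, a move that copied only `data[]` and left `copy` untouched would reproduce the failure mode in the description: the entry would end up in a slot whose bit is clear.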
Fixes: 015632bb30da ("bpf: sk_msg program helper bpf_sk_msg_pull_data")
Fixes: 6fff607e2f14 ("bpf: sk_msg program helper bpf_msg_push_data")
Fixes: 7246d8ed4dcc ("bpf: helper to pop data from messages")
Cc: stable@xxxxxxxxxxxxxxx
Co-developed-by: Han Guidong <2045gemini@xxxxxxxxx>
Signed-off-by: Han Guidong <2045gemini@xxxxxxxxx>
Signed-off-by: Zhang Cen <rollkingzzc@xxxxxxxxx>
---
v2:
Sashiko-bot pointed out that bpf_msg_pull_data() could leave stale copy
bits on collapsed tail entries.
Clear msg->sg.copy for every entry consumed by bpf_msg_pull_data()
before compacting the scatterlist ring.
While researching recent page cache bugs, we discovered this bug.
We confirmed it allows overwriting the page cache of read-only files
via splice(). We haven't attempted to write an exploit, but the
corruption primitive is verified. PoC available upon request.
Recommend fixing ASAP.
I think only "splice() + KTLS + sockmap" is vulnerable, right?
I dug around quite a bit but didn't find any other combination.
Plain TCP/UDP with splice() does not actually go through sockmap (not supported yet).