Re: [PATCH bpf-next v3 3/7] bpf, sockmap: zero-initialize pages allocated in bpf_msg_push_data

From: Alexei Starovoitov

Date: Fri Jun 12 2026 - 21:37:17 EST


On Fri Jun 12, 2026 at 5:28 PM PDT, Kuniyuki Iwashima wrote:
> From: Jiayuan Chen <jiayuan.chen@xxxxxxxxx>
> Date: Fri, 12 Jun 2026 21:07:47 +0800
>> From: Weiming Shi <bestswngs@xxxxxxxxx>
>>
>> bpf_msg_push_data() allocates pages via alloc_pages() without
>> __GFP_ZERO. In the non-copy path, the entire page of uninitialized
>> heap content is added directly to the sk_msg scatterlist, which is
>> then transmitted over TCP to userspace via tcp_bpf_push(). In the
>> copy path, a gap of len bytes between the front and back memcpy
>> regions is similarly left uninitialized.
>>
>> This leads to a kernel heap information leak: stale page content
>> including kernel pointers from the direct-map and vmemmap regions
>> is transmitted to userspace, which can be used to defeat KASLR.
>>
>> Add __GFP_ZERO to the alloc_pages() call to ensure the allocated
>> page is always zeroed before it enters the scatterlist.
>>
>> Link: https://lore.kernel.org/all/20260424155913.A19FDC19425@xxxxxxxxxxxxxxx
>> Fixes: 6fff607e2f14 ("bpf: sk_msg program helper bpf_msg_push_data")
>> Tested-by: Xiang Mei <xmei5@xxxxxxx>
>> Tested-by: Xinyu Ma <mmmxny@xxxxxxxxx>
>> Reviewed-by: Jiayuan Chen <jiayuan.chen@xxxxxxxxx>
>> Reviewed-by: Emil Tsalapatis <emil@xxxxxxxxxxxxxxx>
>> Signed-off-by: Weiming Shi <bestswngs@xxxxxxxxx>
>> Signed-off-by: Jiayuan Chen <jiayuan.chen@xxxxxxxxx>
>> ---
>> net/core/filter.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/net/core/filter.c b/net/core/filter.c
>> index 3e555f276ba80..6e345ca65ca14 100644
>> --- a/net/core/filter.c
>> +++ b/net/core/filter.c
>> @@ -2832,7 +2832,7 @@ BPF_CALL_4(bpf_msg_push_data, struct sk_msg *, msg, u32, start,
>>  	if (unlikely(copy + len < copy))
>>  		return -EINVAL;
>>
>> -	page = alloc_pages(__GFP_NOWARN | GFP_ATOMIC | __GFP_COMP,
>> +	page = alloc_pages(__GFP_NOWARN | GFP_ATOMIC | __GFP_COMP | __GFP_ZERO,
>
> This is a red flag.
>
> We have a bunch of KMSAN reports due to raw/packet sockets,
> which require CAP_NET_ADMIN, and we leave them unfixed even though
> some people have attempted to "fix" them by adding __GFP_ZERO to
> __alloc_skb().

Yep. It's the bpf prog's responsibility to avoid garbage in the payload.
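
To make the point concrete, here is a rough userspace model of the copy
path described in the commit message. It is not kernel or BPF code. The
names push_gap_model(), fill_gap(), stale_in_gap() and STALE_BYTE are
made up for illustration, and the memset() only stands in for whatever
a non-zeroed page might have held. The gap of len bytes stays stale
until the caller, playing the role of the bpf prog, writes every pushed
byte.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: a marker byte standing in for stale page content. */
#define STALE_BYTE 0xAA

/*
 * Hypothetical model of the copy path: allocate src_len + len bytes,
 * copy the first `front` bytes of src, leave a gap of `len` bytes,
 * then copy the remainder after the gap. The gap is never written
 * here, mirroring an allocation done without __GFP_ZERO.
 * Returns NULL on wraparound or allocation failure; caller frees.
 */
static unsigned char *push_gap_model(const unsigned char *src, size_t src_len,
				     size_t front, size_t len)
{
	unsigned char *buf;

	assert(front <= src_len);
	if (src_len + len < src_len)	/* wraparound check, as in the hunk */
		return NULL;

	buf = malloc(src_len + len);
	if (!buf)
		return NULL;

	memset(buf, STALE_BYTE, src_len + len);	/* simulate stale content */
	memcpy(buf, src, front);			/* front region */
	memcpy(buf + front + len, src + front, src_len - front); /* back region */
	return buf;
}

/* What is expected of the program: overwrite every byte it pushed. */
static void fill_gap(unsigned char *buf, size_t front,
		     const unsigned char *data, size_t len)
{
	memcpy(buf + front, data, len);
}

/* Count gap bytes that still hold the stale marker. */
static size_t stale_in_gap(const unsigned char *buf, size_t front, size_t len)
{
	size_t i, n = 0;

	for (i = 0; i < len; i++)
		if (buf[front + i] == STALE_BYTE)
			n++;
	return n;
}
```

In this model, pushing 3 bytes after the first 4 bytes of an 8-byte
message leaves 3 stale bytes in the gap, and none once fill_gap() has
written all 3.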

pw-bot: cr