Re: [PATCH bpf v5 1/2] bpf: Fix OOB in pcpu_init_value

From: Alexei Starovoitov

Date: Thu Apr 02 2026 - 22:12:59 EST


On Thu, Apr 2, 2026 at 7:00 PM Martin KaFai Lau <martin.lau@xxxxxxxxx> wrote:
>
> On Thu, Apr 02, 2026 at 05:05:14PM -0700, Alexei Starovoitov wrote:
> > On Thu, Apr 2, 2026 at 12:59 PM Martin KaFai Lau <martin.lau@xxxxxxxxx> wrote:
> > >
> > > On Thu, Apr 02, 2026 at 11:36:04AM -0700, Alexei Starovoitov wrote:
> > > > On Thu, Apr 2, 2026 at 10:01 AM Martin KaFai Lau <martin.lau@xxxxxxxxx> wrote:
> > > > >
> > > > > On Thu, Apr 02, 2026 at 07:17:17AM -0700, Alexei Starovoitov wrote:
> > > > > > On Thu, Apr 2, 2026 at 12:43 AM xulang <xulang@xxxxxxxxxxxxx> wrote:
> > > > > > >
> > > > > > > From: Lang Xu <xulang@xxxxxxxxxxxxx>
> > > > > > >
> > > > > > > An out-of-bounds read occurs when copying element from a
> > > > > > > BPF_MAP_TYPE_CGROUP_STORAGE map to another pcpu map with the
> > > > > > > same value_size that is not rounded up to 8 bytes.
> > > > > > >
> > > > > > > The issue happens when:
> > > > > > > 1. A CGROUP_STORAGE map is created with value_size not aligned to
> > > > > > > 8 bytes (e.g., 4 bytes)
> > > > > > > 2. A pcpu map is created with the same value_size (e.g., 4 bytes)
> > > > > > > 3. Update element in 2 with data in 1
> > > > > > >
> > > > > > > pcpu_init_value assumes that all sources are rounded up to 8 bytes,
> > > > > > > > and invokes copy_map_value_long to make a data copy. However, the
> > > > > > > assumption doesn't stand since there are some cases where the source
> > > > > > > may not be rounded up to 8 bytes, e.g., CGROUP_STORAGE,
> > > > > >
> > > > > > why? Just round it up there instead of penalizing perf everywhere.
> > > > > >
> > > > > > > skb->data.
> > > > > >
> > > > > > what that means?
> > > > > >
> > > > > > pcpu_init_value() can access skb->data ?
> > > > >
> > > > > After bound check, the skb->data can be used in
> > > > > bpf_map_update_elem(&percpu_lru_map, &key, skb_data, BPF_NOEXIST)
> > > > > which will call pcpu_init_value().
> > > >
> > > > I see, but if we round up on cgroup storage size the problem is gone,
> > > > right?
> > >
> > > Right, it will fix the problem tested in patch 2 which
> > > passes cgroup_storage_value as the source to
> > > pcpu_init_value(). The bug should only manifest with BPF_NOEXIST.
> > > For BPF_EXIST, pcpu_copy_value() will be used and it
> > > currently uses copy_map_value() instead of copy_map_value_long().
> > >
> > > > Doesn't matter what the source of the copy is.
> > >
> > > I think the source (PTR_TO_*) matters here because the bug is about
> > > reading beyond the boundary of the source. A few other map types
> > > were audited when their values were used as the source.
> > >
> > > For skb->data, using skb->data to reproduce is practically
> > > not possible because there should be at least shinfo beyond data_end,
> > > so some of shinfo may get copied to the pcpu map in the extreme case.
> >
> > Yes, but also the verifier checks that ptr + value_size is accessible
> > in that source. In this case, that the value_size bytes are available in skb.
> > So if we round up early at cgroup storage creation time there is no overrun.
>
> The verifier checks that src_ptr + value_size is accessible but the
> percpu's map->value_size is not rounded up to 8 bytes. If the percpu_map
> is created with 4 bytes value_size, the percpu_map->value_size will stay at 4
> and verifier only checks that src + 4 is accessible.

I meant that we can do:
map->value_size = round_up(map->value_size, 8);

There is no promise that the value_size given at creation time is equal to
the one that the verifier uses or the one reported back to user space.
We could do that for hashmap too, I think.
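
To illustrate the arithmetic, here is a rough user-space sketch, not kernel
code. ROUND_UP_POW2 mirrors what round_up() does for a power-of-two
alignment; alloc_size_for(), long_copy_span() and copy_overruns() are
hypothetical helpers made up for this example. It assumes a long-at-a-time
copy touches round_up(value_size, 8) bytes of the source.

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-in for round_up(); y must be a power of two. */
#define ROUND_UP_POW2(x, y) ((((x) - 1) | ((y) - 1)) + 1)

/* Hypothetical: bytes backing a value if the size is rounded up at creation. */
static size_t alloc_size_for(size_t value_size)
{
	return ROUND_UP_POW2(value_size, (size_t)8);
}

/* Hypothetical: bytes read by a copy that moves whole 8-byte words. */
static size_t long_copy_span(size_t value_size)
{
	return ROUND_UP_POW2(value_size, (size_t)8);
}

/* Returns 1 if a word-wise copy would read past the end of the source. */
static int copy_overruns(size_t src_bytes, size_t value_size)
{
	assert(value_size <= src_bytes); /* what the verifier already guarantees */
	return long_copy_span(value_size) > src_bytes;
}
```

With a 4-byte source, copy_overruns(4, 4) reports an overrun because the
word-wise copy reads 8 bytes. Once the source is sized with
alloc_size_for(4), the same copy stays in bounds.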