Re: Question related to ( commit 9f691549f76d "bpf: fix struct htab_elem layout" )

From: Etienne Martineau
Date: Mon Aug 30 2021 - 13:41:16 EST


On Mon, Aug 30, 2021 at 12:39 PM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Mon, Aug 30, 2021 at 7:17 AM Etienne Martineau <etmartin101@xxxxxxxxx> wrote:
> >
> > Hi,
> >
> > I've been staring at this commit for some time and I wonder what were the
> > symptoms when the issue was reproduced?
> > "The bug was discovered by manual code analysis and reproducible
> > only with explicit udelay() in lookup_elem_raw()."
> >
> > I tried various stress test + timing combinations in lookup_elem_raw() but no
> > luck.
>
> That fix was a long time ago :)
> afair the issue will not look like a crash, but rather an element
> will not be found.
> That's what lookup_nulls_elem_raw() is fixing.

Under that same scenario, I wonder whether it's also possible to end up
with a corrupted element somehow?
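
For context, a minimal single-threaded sketch of the nulls-marker check
that a lookup like lookup_nulls_elem_raw() relies on: each chain ends in
a marker that encodes its bucket, so a walk that ends in the wrong bucket
can be detected and restarted. This is an illustration under stated
assumptions, not the kernel code: struct elem, struct table, make_nulls(),
walk() and lookup_nulls() are made-up names, and RCU, locking and memory
ordering are left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N_BUCKETS 4u /* must be a power of two */

/* Hypothetical element layout, for illustration only. */
struct elem {
    struct elem *next; /* real pointer, or a nulls marker at the chain end */
    uint32_t hash;
    int key;
    int value;
};

struct table {
    struct elem *heads[N_BUCKETS];
};

/* Encode a bucket index as an odd "pointer": the low bit marks end of chain. */
static struct elem *make_nulls(uint32_t bucket)
{
    return (struct elem *)((((uintptr_t)bucket) << 1) | 1u);
}

static int is_nulls(const struct elem *p)
{
    return (int)((uintptr_t)p & 1u);
}

static uint32_t nulls_value(const struct elem *p)
{
    return (uint32_t)((uintptr_t)p >> 1);
}

/* Every empty chain is just the nulls marker of its own bucket. */
static void table_init(struct table *t)
{
    for (uint32_t i = 0; i < N_BUCKETS; i++)
        t->heads[i] = make_nulls(i);
}

/* Push an element at the head of the chain selected by its hash. */
static void table_insert(struct table *t, struct elem *e)
{
    uint32_t b = e->hash & (N_BUCKETS - 1);

    assert(!is_nulls(e)); /* real elements must have the low bit clear */
    e->next = t->heads[b];
    t->heads[b] = e;
}

/*
 * Walk a chain starting at p. Returns the matching element, or NULL after
 * storing in *end_bucket the bucket encoded in the marker that ended the walk.
 */
static struct elem *walk(struct elem *p, uint32_t hash, int key,
                         uint32_t *end_bucket)
{
    for (; !is_nulls(p); p = p->next)
        if (p->hash == hash && p->key == key)
            return p;
    *end_bucket = nulls_value(p);
    return NULL;
}

/* Lookup that restarts if the walk ended in a different bucket's chain. */
static struct elem *lookup_nulls(struct table *t, uint32_t hash, int key)
{
    uint32_t b = hash & (N_BUCKETS - 1);
    uint32_t end = b;
    struct elem *e;

again:
    e = walk(t->heads[b], hash, key, &end);
    if (e)
        return e;
    if (end != b) /* we were carried into another chain: retry */
        goto again;
    return NULL;
}
```

In this sketch a reader that still holds a pointer to an element which
has since been reused in another bucket finishes its walk on that other
bucket's marker, which is what the `end != b` comparison catches. Without
the check, the same walk would simply report "not found".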

>
> > I believe that one of our production boxes ran into that issue lately with a GPF
> > in the area of htab_map_lookup_elem(). The crash was seen on an outdated
> > 4.9 stable.
>
> Would be great if you can reproduce it on the latest kernel.

We have another deployment on 5.4 stable running the same BPF code, so
I will let you know.