Re: [bpf/tools] cd17d77705: kernel_selftests.bpf.test_sock_addr.sh.fail
From: Andrii Nakryiko
Date: Fri Jun 28 2019 - 14:14:34 EST
On Thu, Jun 27, 2019 at 7:38 PM Stanislav Fomichev <sdf@xxxxxxxxxxx> wrote:
>
> On 06/27, Andrii Nakryiko wrote:
> > On Thu, Jun 27, 2019 at 10:29 AM Stanislav Fomichev <sdf@xxxxxxxxxxx> wrote:
> > >
> > > On 06/27, Stanislav Fomichev wrote:
> > > > On 06/27, kernel test robot wrote:
> > > > > FYI, we noticed the following commit (built with gcc-7):
> > > > >
> > > > > commit: cd17d77705780e2270937fb3cbd2b985adab3edc ("bpf/tools: sync bpf.h")
> > > > > https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
> > > > >
> > > > > in testcase: kernel_selftests
> > > > > with following parameters:
> > > > >
> > > > > group: kselftests-00
> > > > >
> > > > > test-description: The kernel contains a set of "self tests" under the tools/testing/selftests/ directory. These are intended to be small unit tests to exercise individual code paths in the kernel.
> > > > > test-url: https://www.kernel.org/doc/Documentation/kselftest.txt
> > > > >
> > > > >
> > > > > on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 8G
> > > > >
> > > > > caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
> > > > >
> > > > > # 55: (18) r1 = 0x100000000000000
> > > > > # ; ctx->user_ip6[2] = bpf_htonl(DST_REWRITE_IP6_2);
> > > > > # 57: (7b) *(u64 *)(r6 +16) = r1
> > > > > # invalid bpf_context access off=16 size=8
> > > > This looks like clang doing single u64 write for user_ip6[2] and
> > > > user_ip6[3] instead of two u32. I don't think we allow that.
> > > >
> > > > I've seen this a couple of times myself while playing with some
> > > > progs, but not sure what's the right way to 'fix' it.
> > > >
> > > Any thoughts about the patch below? Another way to "fix" it
> >
> > I'll give it a more thorough look a bit later, but see my comments below.
> >
> > > would be to mark context accesses 'volatile' in bpf progs, but that sounds
> > > a bit gross.
> > >
> > > diff --git a/include/linux/filter.h b/include/linux/filter.h
> > > index 43b45d6db36d..34a14c950e60 100644
> > > --- a/include/linux/filter.h
> > > +++ b/include/linux/filter.h
> > > @@ -746,6 +746,20 @@ bpf_ctx_narrow_access_ok(u32 off, u32 size, u32 size_default)
> > > return size <= size_default && (size & (size - 1)) == 0;
> > > }
> > >
> > > +static inline bool __bpf_ctx_wide_store_ok(u32 off, u32 size)
> >
> > It seems like bpf_ctx_wide_store_ok and __bpf_ctx_wide_store_ok are
> > > used only inside net/core/filter.c, so why declare them in a header file?
> I wanted it to be next to bpf_ctx_narrow_access_ok which does the
> reverse operation for reads.
Ah, ok, I see that bpf_ctx_narrow_access_ok is also used in
kernel/bpf/cgroup.c, and bpf_ctx_wide_store_ok might be useful in
other contexts too. Let's keep it here.
>
> > > +{
> > > + /* u64 access is aligned and fits into the field size */
> > > + return off % sizeof(__u64) == 0 && off + sizeof(__u64) <= size;
> > > +}
> > > +
> > > +#define bpf_ctx_wide_store_ok(off, size, type, field) \
> > > + (size == sizeof(__u64) && \
> > > + off >= offsetof(type, field) && \
> > > + off < offsetofend(type, field) ? \
> > > + __bpf_ctx_wide_store_ok(off - offsetof(type, field), \
> > > + FIELD_SIZEOF(type, field)) : 0)
This would be sufficient, right?
#define bpf_ctx_wide_store_ok(off, size, type, field) \
	(size == sizeof(__u64) && \
	 off >= offsetof(type, field) && \
	 off + size <= offsetofend(type, field) && \
	 off % sizeof(__u64) == 0)
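
To make the simplified check concrete, here is a minimal userspace sketch of the same chain of conditions. It is only an illustration, not kernel code: struct mock_sock_addr, mock_offsetofend, mock_ctx_wide_store_ok and user_ip6_wide_store_ok are hypothetical names, and the mock layout is assumed only to mirror the leading members of struct bpf_sock_addr (user_ip6 starting at offset 8).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the leading members of struct bpf_sock_addr.
 * Only the offsets matter for this sketch. */
struct mock_sock_addr {
	uint32_t user_family;	/* offset 0 */
	uint32_t user_ip4;	/* offset 4 */
	uint32_t user_ip6[4];	/* offsets 8..23 */
	uint32_t user_port;	/* offset 24 */
};

/* Userspace equivalent of the kernel's offsetofend(): first byte past
 * the end of the field. */
#define mock_offsetofend(type, field) \
	(offsetof(type, field) + sizeof(((type *)0)->field))

/* Same chain of && conditions as proposed above, fully parenthesized:
 * the access is exactly 8 bytes, lies entirely inside the field, and
 * starts on an 8-byte boundary. */
#define mock_ctx_wide_store_ok(off, size, type, field)		\
	((size) == sizeof(uint64_t) &&				\
	 (off) >= offsetof(type, field) &&			\
	 (off) + (size) <= mock_offsetofend(type, field) &&	\
	 (off) % sizeof(uint64_t) == 0)

/* Convenience wrapper so the check can be called like a function. */
static bool user_ip6_wide_store_ok(size_t off, size_t size)
{
	return mock_ctx_wide_store_ok(off, size,
				      struct mock_sock_addr, user_ip6);
}
```

With this layout, the store from the failing log (off=16, size=8) passes, while an unaligned 8-byte store at off=12 or one that would run past the end of user_ip6 (off=24) is rejected. Note that this form checks alignment of the absolute context offset, whereas the original patch checked the offset relative to the field; the two agree only when the field itself starts on an 8-byte boundary.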
> >
> > Why do you need ternary operator instead of just a chain of &&s?
> Good point. I didn't spend too much time on the patch tbh :-)
> If it looks good in general, I can add proper tests and do a
> proper submission, this patch is just to get the discussion started.
Consider it started. :) I talked with Yonghong about preventing this
from happening in the first place in Clang, and it seems that would
be harder and more cumbersome than supporting it in the BPF verifier.
So please go ahead and submit a proper patch.
>
> > It also seems like you can avoid macro and use plain function if
> > instead of providing (type, field) you provide values of offsetof and
> > offsetofend (offsetofend - offsetof should equal FIELD_SIZEOF(type,
> > field), shouldn't it?).
> But then I'd have to copy-paste the args of offsetof/offsetofend at
> the caller, right? I wanted the caller to be clean and simple.
Yeah, that's a bit verbose, I agree. I don't mind macro, so no worries.
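
For comparison, a rough userspace sketch of the two call styles discussed here. All names (struct mock_ctx, wide_store_ok, WIDE_STORE_OK and the two ip6_store_ok_* helpers) are hypothetical and exist only to illustrate the trade-off: a plain function needs the field bounds spelled out at every call site, while a thin macro can derive them from (type, field).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical context layout, used only for this illustration. */
struct mock_ctx {
	uint32_t family;	/* offset 0 */
	uint32_t ip4;		/* offset 4 */
	uint32_t ip6[4];	/* offsets 8..23 */
};

/* Plain-function variant: the caller passes the field's start offset
 * and the offset one past its end. */
static inline bool wide_store_ok(size_t off, size_t size,
				 size_t field_start, size_t field_end)
{
	return size == sizeof(uint64_t) &&
	       off >= field_start &&
	       off + size <= field_end &&
	       off % sizeof(uint64_t) == 0;
}

/* Thin macro wrapper: derives both bounds from (type, field), so the
 * call site stays short. */
#define WIDE_STORE_OK(off, size, type, field)				\
	wide_store_ok(off, size, offsetof(type, field),			\
		      offsetof(type, field) + sizeof(((type *)0)->field))

/* Call site without the macro: the field has to be named twice. */
static bool ip6_store_ok_verbose(size_t off, size_t size)
{
	return wide_store_ok(off, size,
			     offsetof(struct mock_ctx, ip6),
			     offsetof(struct mock_ctx, ip6) +
			     sizeof(((struct mock_ctx *)0)->ip6));
}

/* Call site with the macro. */
static bool ip6_store_ok_short(size_t off, size_t size)
{
	return WIDE_STORE_OK(off, size, struct mock_ctx, ip6);
}
```

Both helpers compute the same result; the difference is only how much the caller has to repeat.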
>
> > > #define bpf_classic_proglen(fprog) (fprog->len * sizeof(fprog->filter[0]))
> > >
> > > static inline void bpf_prog_lock_ro(struct bpf_prog *fp)
> > > diff --git a/net/core/filter.c b/net/core/filter.c
> > > index 2014d76e0d2a..2d3787a439ae 100644
> > > --- a/net/core/filter.c
> > > +++ b/net/core/filter.c
> > > @@ -6849,6 +6849,16 @@ static bool sock_addr_is_valid_access(int off, int size,
> > > if (!bpf_ctx_narrow_access_ok(off, size, size_default))
> > > return false;
> > > } else {
> > > + if (bpf_ctx_wide_store_ok(off, size,
> > > + struct bpf_sock_addr,
> > > + user_ip6))
> > > + return true;
> > > +
> > > + if (bpf_ctx_wide_store_ok(off, size,
> > > + struct bpf_sock_addr,
> > > + msg_src_ip6))
> > > + return true;
> > > +
> > > if (size != size_default)
> > > return false;
> > > }
> > > @@ -7689,9 +7699,6 @@ static u32 xdp_convert_ctx_access(enum bpf_access_type type,
> > > /* SOCK_ADDR_STORE_NESTED_FIELD_OFF() has semantic similar to
> > > * SOCK_ADDR_LOAD_NESTED_FIELD_SIZE_OFF() but for store operation.
> > > *
> > > - * It doesn't support SIZE argument though since narrow stores are not
> > > - * supported for now.
> > > - *
> > > * In addition it uses Temporary Field TF (member of struct S) as the 3rd
> > > * "register" since two registers available in convert_ctx_access are not
> > > * enough: we can't override neither SRC, since it contains value to store, nor
> > > @@ -7699,7 +7706,7 @@ static u32 xdp_convert_ctx_access(enum bpf_access_type type,
> > > * instructions. But we need a temporary place to save pointer to nested
> > > * structure whose field we want to store to.
> > > */
> > > -#define SOCK_ADDR_STORE_NESTED_FIELD_OFF(S, NS, F, NF, OFF, TF) \
> > > +#define SOCK_ADDR_STORE_NESTED_FIELD_OFF(S, NS, F, NF, SIZE, OFF, TF) \
> > > do { \
> > > int tmp_reg = BPF_REG_9; \
> > > if (si->src_reg == tmp_reg || si->dst_reg == tmp_reg) \
> > > @@ -7710,8 +7717,7 @@ static u32 xdp_convert_ctx_access(enum bpf_access_type type,
> > > offsetof(S, TF)); \
> > > *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(S, F), tmp_reg, \
> > > si->dst_reg, offsetof(S, F)); \
> > > - *insn++ = BPF_STX_MEM( \
> > > - BPF_FIELD_SIZEOF(NS, NF), tmp_reg, si->src_reg, \
> > > + *insn++ = BPF_STX_MEM(SIZE, tmp_reg, si->src_reg, \
> > > bpf_target_off(NS, NF, FIELD_SIZEOF(NS, NF), \
> > > target_size) \
> > > + OFF); \
> > > @@ -7723,8 +7729,8 @@ static u32 xdp_convert_ctx_access(enum bpf_access_type type,
> > > TF) \
> > > do { \
> > > if (type == BPF_WRITE) { \
> > > - SOCK_ADDR_STORE_NESTED_FIELD_OFF(S, NS, F, NF, OFF, \
> > > - TF); \
> > > + SOCK_ADDR_STORE_NESTED_FIELD_OFF(S, NS, F, NF, SIZE, \
> > > + OFF, TF); \
> > > } else { \
> > > SOCK_ADDR_LOAD_NESTED_FIELD_SIZE_OFF( \
> > > S, NS, F, NF, SIZE, OFF); \