On Tue, Nov 28, 2023 at 10:13:50AM -0600, Daniel Xu wrote:
> On Mon, Nov 27, 2023 at 08:06:01PM -0800, Yonghong Song wrote:
> > On 11/27/23 7:01 PM, Daniel Xu wrote:
> > > On Mon, Nov 27, 2023 at 02:45:11PM -0600, Daniel Xu wrote:
> > > > On Sun, Nov 26, 2023 at 09:53:04PM -0800, Yonghong Song wrote:
> > > > > [...]
> > > > >
> > > > > On 11/27/23 12:44 AM, Yonghong Song wrote:
> > > > > > On 11/26/23 8:52 PM, Eduard Zingerman wrote:
> > > > > > > On Sun, 2023-11-26 at 18:04 -0600, Daniel Xu wrote:
> > > > > > > > [...]
> > > > > > > > Is there a reason to prefer fixing in compiler? I'm not
> > > > > > > > opposed to it, but the downside to compiler fix is it takes
> > > > > > > > years to propagate and sprinkles ifdefs into the code.

Tbh I'm not sure. This test passes with preserve_static_offset
because it suppresses preserve_access_index. In general clang
translates bitfield access to a set of IR statements like:

  C:
    struct foo {
      unsigned _;
      unsigned a:1;
      ...
    };
    ... foo->a ...

  IR:
    %a = getelementptr inbounds %struct.foo, ptr %0, i32 0, i32 1
    %bf.load = load i8, ptr %a, align 4
    %bf.clear = and i8 %bf.load, 1
    %bf.cast = zext i8 %bf.clear to i32

With preserve_static_offset the getelementptr+load are replaced by a
single statement which is preserved as-is till code generation, thus
the load with align 4 is preserved.

On the other hand, I'm not sure that clang guarantees that loads or
stores used for bitfield access would always be aligned according to
verifier expectations.

I think we should check if there are some clang knobs that prevent
generation of unaligned memory access. I'll take a look.
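
For reference, preserve_static_offset is applied at the struct level; a
minimal sketch, assuming the clang attribute behaves as described above
(untested):

  /* Accesses to this struct keep their statically known offsets;
   * preserve_access_index (CO-RE) is not applied to them.
   */
  struct foo {
          unsigned _;
          unsigned a:1;
  } __attribute__((preserve_static_offset));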
> > > > > > > > Would it be possible to have an analogue of
> > > > > > > > BPF_CORE_READ_BITFIELD()?
> > > > > > >
> > > > > > > Well, the contraption below passes verification, tunnel selftest
> > > > > > > appears to work. I might have messed up some shifts in the macro,
> > > > > > > though.
> > > > > >
> > > > > > I didn't test it. But from high level it should work.
> > > > > >
> > > > > > > Still, if clang would pick unlucky BYTE_{OFFSET,SIZE} for a
> > > > > > > particular field, access might be unaligned.
> > > > > >
> > > > > > clang should pick a sensible BYTE_SIZE/BYTE_OFFSET to meet the
> > > > > > alignment requirement. This is also required for
> > > > > > BPF_CORE_READ_BITFIELD.
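
For comparison, the existing read-side macro in bpf_core_read.h is used
along these lines; a sketch using the erspan_md2 'dir' field purely for
illustration (untested):

  #include "vmlinux.h"
  #include <bpf/bpf_core_read.h>

  /* Loads BYTE_SIZE bytes at BYTE_OFFSET, then applies the
   * LSHIFT_U64/RSHIFT_U64 relocations to extract the field.
   */
  static __always_inline __u64 erspan_dir(struct erspan_md2 *md2)
  {
          return BPF_CORE_READ_BITFIELD(md2, dir);
  }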
> > > > > > > ---
> > > > > > > diff --git a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c
> > > > > > > index 3065a716544d..41cd913ac7ff 100644
> > > > > > > --- a/tools/testing/selftests/bpf/progs/test_tunnel_kern.c
> > > > > > > +++ b/tools/testing/selftests/bpf/progs/test_tunnel_kern.c
> > > > > > > @@ -9,6 +9,7 @@
> > > > > > >  #include "vmlinux.h"
> > > > > > >  #include <bpf/bpf_helpers.h>
> > > > > > >  #include <bpf/bpf_endian.h>
> > > > > > > +#include <bpf/bpf_core_read.h>
> > > > > > >  #include "bpf_kfuncs.h"
> > > > > > >  #include "bpf_tracing_net.h"
> > > > > > >
> > > > > > > @@ -144,6 +145,38 @@ int ip6gretap_get_tunnel(struct __sk_buff *skb)
> > > > > > >  	return TC_ACT_OK;
> > > > > > >  }
> > > > > > > +#define BPF_CORE_WRITE_BITFIELD(s, field, new_val) ({            \
> > > > > > > +    void *p = (void *)s + __CORE_RELO(s, field, BYTE_OFFSET);    \
> > > > > > > +    unsigned byte_size = __CORE_RELO(s, field, BYTE_SIZE);       \
> > > > > > > +    unsigned lshift = __CORE_RELO(s, field, LSHIFT_U64);         \
> > > > > > > +    unsigned rshift = __CORE_RELO(s, field, RSHIFT_U64);         \
> > > > > > > +    unsigned bit_size = (rshift - lshift);                       \
> > > > > > > +    unsigned long long nval, val, hi, lo;                        \
> > > > > > > +                                                                 \
> > > > > > > +    asm volatile("" : "=r"(p) : "0"(p));                         \
> > > > > >
> > > > > > Use asm volatile("" : "+r"(p)) ?
> > > > > >
> > > > > > > +                                                                 \
> > > > > > > +    switch (byte_size) {                                         \
> > > > > > > +    case 1: val = *(unsigned char *)p; break;                    \
> > > > > > > +    case 2: val = *(unsigned short *)p; break;                   \
> > > > > > > +    case 4: val = *(unsigned int *)p; break;                     \
> > > > > > > +    case 8: val = *(unsigned long long *)p; break;               \
> > > > > > > +    }                                                            \
> > > > > > > +    hi = val >> (bit_size + rshift);                             \
> > > > > > > +    hi <<= bit_size + rshift;                                    \
> > > > > > > +    lo = val << (bit_size + lshift);                             \
> > > > > > > +    lo >>= bit_size + lshift;                                    \
> > > > > > > +    nval = new_val;                                              \
> > > > > > > +    nval <<= lshift;                                             \
> > > > > > > +    nval >>= rshift;                                             \
> > > > > > > +    val = hi | nval | lo;                                        \
> > > > > > > +    switch (byte_size) {                                         \
> > > > > > > +    case 1: *(unsigned char *)p = val; break;                    \
> > > > > > > +    case 2: *(unsigned short *)p = val; break;                   \
> > > > > > > +    case 4: *(unsigned int *)p = val; break;                     \
> > > > > > > +    case 8: *(unsigned long long *)p = val; break;               \
> > > > > > > +    }                                                            \
> > > > > > > +})
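
At a call site, the macro would replace a plain bitfield store roughly
like so; a sketch based on the erspan path in test_tunnel_kern.c
(helper name invented for illustration; untested):

  static __always_inline void erspan_set_dir(struct erspan_metadata *md,
                                             __u8 direction)
  {
          /* instead of a direct bitfield store: md->u.md2.dir = direction; */
          BPF_CORE_WRITE_BITFIELD(&md->u.md2, dir, direction);
  }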
> > > > > >
> > > > > > I think this should be put in libbpf public header files but not
> > > > > > sure where to put it. bpf_core_read.h although it is core write?
> > > > > >
> > > > > > But on the other hand, this is a uapi struct bitfield write,
> > > > > > strictly speaking, CORE write is really unnecessary here. It
> > > > > > would be great if we can relieve users from dealing with
> > > > > > such unnecessary CORE writes. In that sense, for this particular
> > > > > > case, I would prefer rewriting the code by using byte-level
> > > > > > stores... or preserve_static_offset to clearly mean to undo
> > > > > > bitfield CORE ...
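
Concretely, a byte-level store replaces the bitfield access with an
explicit mask-and-shift on one byte. A generic sketch (the offset and
bit position are made up for illustration; real values must come from
the uapi layout; untested):

  /* hypothetical 1-bit flag at bit 3 of the byte at offset 7 */
  __u8 *b = (__u8 *)&md + 7;

  *b &= ~(1 << 3);              /* clear the flag's bit */
  *b |= (!!direction) << 3;     /* store the new value */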
> > > >
> > > > Ok, I will do byte-level rewrite for next revision.
> > >
> > > This patch seems to work: https://pastes.dxuuu.xyz/0glrf9 .
> > >
> > > But I don't think it's very pretty. Also I'm seeing on the internet
> > > that people are saying the exact layout of bitfields is compiler
> > > dependent. So I am wondering if these byte sized writes are correct.
> > > For that matter, I am wondering how the GCC generated bitfield
> > > accesses line up with clang generated BPF bytecode. Or why the uapi
> > > contains a bitfield.
> >
> > Any reference for this (exact layout of bitfields is compiler
> > dependent)?
> >
> > One thing for sure: the memory layout of bitfields should be the same
> > for both clang and gcc, as it is determined by the C standard. The
> > register representation and how to manipulate it could be different
> > for different compilers.
>
> I was reading this thread:
> https://github.com/Lora-net/LoRaMac-node/issues/697. It's obviously
> not authoritative, but they sure sound confident!
>
> I think I've also heard it before, a long time ago, when I was working
> on adding bitfield support to bpftrace.
>
> Wikipedia [0] also claims this:
>
>     The layout of bit fields in a C struct is implementation-defined.
>     For behavior that remains predictable across compilers, it may be
>     preferable to emulate bit fields with a primitive and bit
>     operators:
>
> [0]: https://en.wikipedia.org/wiki/Bit_field#C_programming_language
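
In that spirit, a bit field can be emulated with a plain unsigned
integer and explicit masks; a generic sketch (names invented for
illustration):

  #define DIR_SHIFT 0
  #define DIR_MASK  (1u << DIR_SHIFT)

  unsigned int flags = 0;

  /* set: clear the field's bits, then or-in the new value */
  flags = (flags & ~DIR_MASK) | ((1u << DIR_SHIFT) & DIR_MASK);

  /* get: mask, then shift back down */
  unsigned int dir = (flags & DIR_MASK) >> DIR_SHIFT;

This trades the compiler-chosen layout for one the programmer fully
controls.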