Re: Use of zero-length arrays in bcachefs structures inner fields

From: Alexander Potapenko
Date: Tue May 28 2024 - 07:37:02 EST


On Fri, May 24, 2024 at 7:30 PM Kent Overstreet
<kent.overstreet@xxxxxxxxx> wrote:
>
> On Fri, May 24, 2024 at 12:04:11PM -0400, Mathieu Desnoyers wrote:
> > On 2024-05-24 11:35, Mathieu Desnoyers wrote:
> > > [ Adding clang/llvm and KMSAN maintainers/reviewers in CC. ]
> > >
> > > On 2024-05-24 11:28, Kent Overstreet wrote:
> > > > On Thu, May 23, 2024 at 01:53:42PM -0400, Mathieu Desnoyers wrote:
> > > > > Hi Kent,
> > > > >
> > > > > Looking around in the bcachefs code for possible causes of this KMSAN
> > > > > bug report:
> > > > >
> > > > > https://lore.kernel.org/lkml/000000000000fd5e7006191f78dc@xxxxxxxxxx/
> > > > >
> > > > > > I notice the following pattern in the bcachefs structures: zero-length
> > > > > > array members are inserted in structures (not always at the end),
> > > > > seemingly to achieve a result similar to what could be done with a
> > > > > union:
> > > > >
> > > > > fs/bcachefs/bcachefs_format.h:
> > > > >
> > > > > > struct bkey_packed {
> > > > > >         __u64 _data[0];
> > > > > >
> > > > > >         /* Size of combined key and value, in u64s */
> > > > > >         __u8 u64s;
> > > > > >         [...]
> > > > > > };
> > > > >
> > > > > likewise:
> > > > >
> > > > > > struct bkey_i {
> > > > > >         __u64 _data[0];
> > > > > >
> > > > > >         struct bkey k;
> > > > > >         struct bch_val v;
> > > > > > };

I took a glance at the LLVM IR for fs/bcachefs/bset.c, and it defines
struct bkey_packed and bkey_i as:

%struct.bkey_packed = type { [0 x i64], i8, i8, i8, [0 x i8], [37 x i8] }
%struct.bkey_i = type { [0 x i64], %struct.bkey, %struct.bch_val }

This looks more or less as expected, so I don't think the zero-length
arrays are causing problems with KMSAN right now.
Moreover, there are cases in e.g. include/linux/skbuff.h where
zero-length arrays are used for the same purpose, and KMSAN handles
them just fine.

Yet I want to point out that even GCC discourages the use of
zero-length arrays in the middle of a struct:
https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html, so Clang is not
unique here.
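
To illustrate the pattern under discussion, here is a minimal sketch.
The struct demo_key, its fields other than u64s, struct demo_key_union
and demo_data_aliases_first_field() are hypothetical names made up for
this example; this is not the real bcachefs layout. It assumes a
compiler that accepts the GNU zero-length array extension (GCC or
Clang) and C11 anonymous unions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for illustration; NOT the real bcachefs layout. */
struct demo_key {
	uint64_t _data[0];	/* GNU extension: no storage, but u64 alignment */
	uint8_t  u64s;
	uint8_t  format;
	uint8_t  type;
	uint8_t  pad[5];
	uint64_t payload;
};

/* The same "view as u64 words" idea expressed with an anonymous union. */
struct demo_key_union {
	union {
		uint64_t _data[2];
		struct {
			uint8_t  u64s;
			uint8_t  format;
			uint8_t  type;
			uint8_t  pad[5];
			uint64_t payload;
		};
	};
};

/* The zero-length array occupies no space and overlays the first field. */
static_assert(offsetof(struct demo_key, _data) == 0, "_data at offset 0");
static_assert(offsetof(struct demo_key, u64s) == 0, "u64s at offset 0");
static_assert(sizeof(struct demo_key) == sizeof(struct demo_key_union),
	      "both variants have the same size");

/* Returns 1 if _data points at the same address as the first real field. */
int demo_data_aliases_first_field(void)
{
	struct demo_key k;

	return (const void *)k._data == (const void *)&k.u64s;
}
```

In the union variant the overlay is spelled out in the type itself, so
the compiler knows the bounds of _data, whereas the zero-length array
relies on the extension and gives _data no extent of its own.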

Regarding the original KMSAN bug, as noted in
https://lore.kernel.org/all/0000000000009f9447061833d477@xxxxxxxxxx/T/,
KMSAN might be missing the event of data being copied from the disk
into bcachefs structs, in which case that memory would still be
treated as uninitialized.
I'd appreciate help from someone knowledgeable about how disk I/O is
implemented in the kernel.