Re: Use of zero-length arrays in bcachefs structures inner fields

From: Kent Overstreet
Date: Tue May 28 2024 - 11:04:53 EST


On Tue, May 28, 2024 at 01:36:11PM +0200, Alexander Potapenko wrote:
> On Fri, May 24, 2024 at 7:30 PM Kent Overstreet
> <kent.overstreet@xxxxxxxxx> wrote:
> >
> > On Fri, May 24, 2024 at 12:04:11PM -0400, Mathieu Desnoyers wrote:
> > > On 2024-05-24 11:35, Mathieu Desnoyers wrote:
> > > > [ Adding clang/llvm and KMSAN maintainers/reviewers in CC. ]
> > > >
> > > > On 2024-05-24 11:28, Kent Overstreet wrote:
> > > > > On Thu, May 23, 2024 at 01:53:42PM -0400, Mathieu Desnoyers wrote:
> > > > > > Hi Kent,
> > > > > >
> > > > > > Looking around in the bcachefs code for possible causes of this KMSAN
> > > > > > bug report:
> > > > > >
> > > > > > https://lore.kernel.org/lkml/000000000000fd5e7006191f78dc@xxxxxxxxxx/
> > > > > >
> > > > > > > I notice the following pattern in the bcachefs structures: zero-length
> > > > > > > array members are inserted in structures (not always at the end),
> > > > > > seemingly to achieve a result similar to what could be done with a
> > > > > > union:
> > > > > >
> > > > > > fs/bcachefs/bcachefs_format.h:
> > > > > >
> > > > > > > struct bkey_packed {
> > > > > > > 	__u64 _data[0];
> > > > > > >
> > > > > > > 	/* Size of combined key and value, in u64s */
> > > > > > > 	__u8 u64s;
> > > > > > > 	[...]
> > > > > > > };
> > > > > >
> > > > > > likewise:
> > > > > >
> > > > > > > struct bkey_i {
> > > > > > > 	__u64 _data[0];
> > > > > > >
> > > > > > > 	struct bkey k;
> > > > > > > 	struct bch_val v;
> > > > > > > };
>
> I took a glance at the LLVM IR for fs/bcachefs/bset.c, and it defines
> struct bkey_packed and bkey_i as:
>
> %struct.bkey_packed = type { [0 x i64], i8, i8, i8, [0 x i8], [37 x i8] }
> %struct.bkey_i = type { [0 x i64], %struct.bkey, %struct.bch_val }
>
> , which more or less looks as expected, so I don't think it could be
> causing problems with KMSAN right now.
> Moreover, there are cases in e.g. include/linux/skbuff.h where
> zero-length arrays are used for the same purpose, and KMSAN handles
> them just fine.
>
> Yet I want to point out that even GCC discourages the use of
> zero-length arrays in the middle of a struct:
> https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html, so Clang is not
> unique here.
>
> Regarding the original KMSAN bug, as noted in
> https://lore.kernel.org/all/0000000000009f9447061833d477@xxxxxxxxxx/T/,
> we might be missing the event of copying data from the disk to
> bcachefs structs.
> I'd appreciate help from someone knowledgeable about how disk I/O is
> implemented in the kernel.

If that were missing, I'd expect everything to be breaking. What's the
helper that marks memory as initialized?
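
For anyone following along, here is a minimal userspace sketch of the
pattern being discussed above: a zero-length array placed first in a
struct so that the following fields can also be addressed as raw u64
words. struct demo_key and the helper below are hypothetical stand-ins
made up for illustration, not the real bcachefs definitions, and the
zero-length array itself is a GNU C extension rather than standard C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the layout quoted above; the field names
 * after u64s are assumptions, not the real bcachefs format.
 */
struct demo_key {
	uint64_t _data[0];	/* GNU extension: occupies no storage */

	uint8_t  u64s;		/* first real field, at offset 0 */
	uint8_t  format;
	uint8_t  type;
	uint8_t  pad[5];
};

/*
 * Returns 1 if the zero-length member starts at the same address as
 * the first real field, i.e. it acts as an alias for the struct's
 * storage rather than as storage of its own.
 */
static int demo_key_data_aliases_fields(const struct demo_key *k)
{
	return (const void *)k->_data == (const void *)&k->u64s;
}
```

The zero-length member contributes only its alignment; it adds nothing
to sizeof, which is why the LLVM IR quoted above shows it as [0 x i64]
ahead of the real fields. A union of the fields with a u64 array would
give a similar view in standard C.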