Re: Use of zero-length arrays in bcachefs structures inner fields

From: Mathieu Desnoyers
Date: Fri May 24 2024 - 12:03:43 EST


On 2024-05-24 11:35, Mathieu Desnoyers wrote:
[ Adding clang/llvm and KMSAN maintainers/reviewers in CC. ]

On 2024-05-24 11:28, Kent Overstreet wrote:
On Thu, May 23, 2024 at 01:53:42PM -0400, Mathieu Desnoyers wrote:
Hi Kent,

Looking around in the bcachefs code for possible causes of this KMSAN
bug report:

https://lore.kernel.org/lkml/000000000000fd5e7006191f78dc@xxxxxxxxxx/

I notice the following pattern in the bcachefs structures: zero-length
arrays members are inserted in structures (not always at the end),
seemingly to achieve a result similar to what could be done with a
union:

fs/bcachefs/bcachefs_format.h:

struct bkey_packed {
         __u64           _data[0];

         /* Size of combined key and value, in u64s */
         __u8            u64s;
[...]
};

likewise:

struct bkey_i {
         __u64                   _data[0];

         struct bkey     k;
         struct bch_val  v;
};

(and there are many more examples of this pattern in bcachefs)

AFAIK, the C11 standard states that array declarator constant expression

Effectively, we can verify that this code triggers an undefined behavior
with:

#include <stdio.h>

struct z {
         int x[0];
         int y;
         int z;
} __attribute__((packed));

int main(void)
{
         struct z a;

         a.y = 1;
         printf("%d\n", a.x[0]);
}
delimited by [ ] shall have a value greater than zero.

Yet another example of the C people going absolutely nutty with
everything being undefined. Look, this isn't ok, we need to get work
done, and I've already wasted entirely too much time on ZLA vs. flex
array member nonsense.

There's a bunch of legit uses for zero length arrays, and your example,
where we're not even _assigning_ to x, is just batshit. Someone needs to
get his head examined.

Notice how a.y is first set to 1, then a.x[0] is loaded, expecting to
alias with a.y.

This is the same aliasing pattern found in bcachefs, for instance here:

bcachefs_format.h:

struct jset {
[...]
__u8 encrypted_start[0];

__le16 _read_clock; /* no longer used */
__le16 _write_clock;

/* Sequence number of oldest dirty journal entry */
__le64 last_seq;


struct jset_entry start[0];
__u64 _data[];
} __packed __aligned(8);

where struct jset last_seq field is set by jset_validate():

jset->last_seq = jset->seq;

and where journal_read_bucket() uses the encrypted_start member as input:

ret = bch2_encrypt(c, JSET_CSUM_TYPE(j), journal_nonce(j),
j->encrypted_start,
vstruct_end(j) - (void *) j->encrypted_start);

Regards,

Mathieu



So I wonder if the issue reported by KMSAN could be caused by this
pattern ?

Possibly; the KMSAN errors I've been looking at do look suspicious. But
it sounds like we need a real fix that involves defining proper
semantics, not compiler folks giving up and saying 'aiee!'.

IOW, clang/KMSAN are broken if they simply choke on a zero length array
being present.


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com