Re: [REGRESSION][BISECTED] erroneous buffer overflow detected in bch2_xattr_validate

From: Jan Hendrik Farr
Date: Thu Oct 03 2024 - 17:48:30 EST


On 03 14:28:01, Kees Cook wrote:
> On Thu, Oct 03, 2024 at 05:17:08PM +0200, Jan Hendrik Farr wrote:
> > gcc currently says that the __bdos of struct containing a flexible array
> > member is:
> >
> > sizeof(<whole struct>) + sizeof(<flexible array element>) * <count>
> >
> > clang however does the following:
> >
> > max(sizeof(<whole struct>), offsetof(<flexible array member>) + sizeof(<flexible array element>) * <count>)
>
> Clang's calculation seems very wrong. I would expect it to match GCC's.
>

I was on the very same train of thought, but I have since changed my
mind a bit. A struct containing a flexible array member can be allocated in
two ways:

(1):

struct posix_acl *acl = malloc(sizeof(struct posix_acl) + sizeof(struct posix_acl_entry) * 1);
acl.a_count = 1;

or (2):

struct posix_acl *acl = malloc(offsetof(struct posix_acl, a_entries) + sizeof(struct posix_acl_entry) * 1);
acl.a_count = 1;

Both are valid ways to allocate it. __bdos does not know which of these
methods was used to allocate the struct whose size it has to determine,
so it's giving the lower bound that doesn't include the (potential)
padding at the end.

So it comes down to false positives vs false negatives...
More details here:
https://github.com/llvm/llvm-project/pull/111015

Clangs current behavior would essentially force kernel code to always
assume option (2) is used. So

struct posix_acl *
posix_acl_clone(const struct posix_acl *acl, gfp_t flags)
{
struct posix_acl *clone = NULL;

if (acl) {
int size = sizeof(struct posix_acl) + acl->a_count *
sizeof(struct posix_acl_entry);
clone = kmemdup(acl, size, flags);
if (clone)
refcount_set(&clone->a_refcount, 1);
}
return clone;
}
EXPORT_SYMBOL_GPL(posix_acl_clone);

from linux/fs/posix_acl.c would have to turn into something like:

struct posix_acl *
posix_acl_clone(const struct posix_acl *acl, gfp_t flags)
{
struct posix_acl *clone = NULL;

if (acl) {
int size = offsetof(struct posix_acl, a_entries) + acl->a_count *
sizeof(struct posix_acl_entry);
clone = kmemdup(acl, size, flags);
if (clone)
refcount_set(&clone->a_refcount, 1);
}
return clone;
}
EXPORT_SYMBOL_GPL(posix_acl_clone);

Which is actually safer, because can you actually be sure this posix_acl
wasn't allocated using method (2)?


After looking at the assembly produced by gcc more, it actually looks
like it's using the allocation size if it's known in the current context
(for example if the struct was just malloced in the same function)
and otherwise returns INT_MAX for the __bdos of a struct containing a
flexible array member. It's only returning the size based on the
__counted_by attribute of you ask it for the __bdos of the flexible
array member itself.