Re: [GIT PULL] percpu fix for v5.9-rc6

From: Gustavo A. R. Silva
Date: Fri Sep 18 2020 - 15:28:53 EST

On Fri, Sep 18, 2020 at 10:23:54AM -0700, Linus Torvalds wrote:
> On Fri, Sep 18, 2020 at 9:17 AM Gustavo A. R. Silva
> <gustavoars@xxxxxxxxxx> wrote:
> >
> > This bug could have been prevented by either adopting better
> > coding practices or through the use[3] of the recent struct_size() helper.
> Well, my unspoken point was that coding practices are just
> theoretical. Coding practices don't help - actual *checking* of them
> helps.

Yep; I agree that the best approach is to find a way to enforce such
practices. :)

> I realize that structures with flexible-array member are allowed to
> use sizeof() in standard C, but if we want to make sure this doesn't
> happen, we would need to have a stricter model than that. But a quick
> google didn't find any flag to enable such a stricter mode.

OK. It seems that we are talking about two different things here. One thing
is to apply sizeof() to a structure that contains a flexible-array member.
And the other thing is to apply sizeof() to a flexible array. The former
is allowed, the latter is wrong and we already get a build error when that

Applying sizeof() to a structure containing a flex-array member is allowed,
and the result is the size of the structure excluding the size of the
flexible-array member type. In this regard, using a zero-length array has
the same effect as using a flexible-array member.

Now, if you are trying to make the case for not allowing the application
of sizeof() to a structure that contains a flexible-array member because
that could be prone to error, then this is new and we need to think about
the implications and see if we can move in that direction.

> I guess a sparse warning would work, but sparse already has too many
> warnings and as a result most people don't care - even if they were to
> run sparse in the first place.

There are ongoing efforts to warn about the use of zero-length and one-element
arrays through the use of Coccinelle. But again, you might be talking
about a different thing...

> Is there some gcc option that I didn't find to help find any questionable cases?

If the questionable case is the application of sizeof() to a flex-array
member or a flex-array member not occuring last in the containing structure,
then yes, GCC already generates a build error for both cases. And that's
what we want, see at the bottom...

> Because if we have a coding practice that you should use
> 'struct_size()', then we should also have a way to _verify_ that.

struct_size() should be used to defend against these sorts of bugs[1] as I
explained in the last email[2], also to prevent integer overflows. It's meant
to replace the following sorts of idioms:

sizeof(struct foo) + sizeof(struct boo) * COUNT
sizeof(*foo) + sizeof(*boo) * COUNT
sizeof(*foo) + sizeof(*foo->array) * COUNT
sizeof(*foo) + sizeof(foo->array[0]) * COUNT

and all the many variations...

But struct_size() by itself is not meant to help identify any kind of array
misuses in structures. struct_size() can be used with structures that contain
any kind of arrays (zero-length/one-element/flexible array).

> The whole - and really ONLY - point of using flexible arrays was that
> it would protect against these things. And as things are now, it
> simply doesn't. It's not an actual improvement over just using a
> zero-sized array.

It protects against the cases I have explained at the top and that I elaborate

> (Slightly related: copying a struct has the exact same issue. A
> flexible array is no better than a zero-sized array, and generates the
> same code and the same lack of any warnings, afaik).

An important difference is that the use of a flexible-array member allows
the compiler to generate errors when the flexible array does not occur last
in the structure, which helps to prevent some kind of undefined behavior bugs[3]
from being inadvertently introduced to the codebase. It also allows the compiler
to correctly analyze array sizes (via sizeof(), CONFIG_FORTIFY_SOURCE, and
CONFIG_UBSAN_BOUNDS). For instance, there is no mechanism that warns us that
the following application of the sizeof() operator to a zero-length array always
results in zero:

struct something {
size_t count;
struct foo items[0];

struct something *instance;

instance = kmalloc(struct_size(instance, items, count), GFP_KERNEL);
instance->count = count;

size = sizeof(instance->items) * instance->count;
memcpy(instance->items, source, size);

At the last line of code above, size turns out to be zero, when one might have
thought it represents the total size in bytes of the dynamic memory recently
allocated for the trailing array items. Here are a couple examples of this
issue[4][5]. Instead, flexible array members have incomplete type, and so the
sizeof() operator may not be applied, so any misuse of such operators will be
immediately noticed at build time.

We have documented all the above and more here[6], and there is an update waiting
to land in mainline here[7].

But again, you might be more concerned about applying sizeof() to a structure
containing a flexible-array member in the first place, rather than applying it
to the member alone...?